Abstract

In this paper, we study convex analysis and its theoretical applications. We apply important tools of convex analysis to Optimization and to Analysis. Then we show various deep applications of convex analysis and especially infimal convolution in Monotone Operator Theory. Among other things, we recapture the Minty surjectivity theorem in Hilbert space, and present a new proof of the sum theorem in reflexive spaces. More technically, we also discuss autoconjugate representers for maximally monotone operators. Finally, we consider various other applications in mathematical analysis.
1 Introduction

While other articles in this collection look at the applications of Moreau's seminal work, we have 
opted to illustrate the power of his ideas theoretically within optimization theory and within math- 
ematics more generally. Space constraints preclude being comprehensive, but we think the presen- 
tation made shows how elegant modern analysis can be made thanks to the work of Jean-Jacques 
Moreau and others. 
1.1 Preliminaries



Let $X$ be a real Banach space with norm $\|\cdot\|$ and dual norm $\|\cdot\|_*$. When there is no ambiguity we suppress the $*$. We write $X^*$ and $\langle \cdot, \cdot \rangle$ for the real dual space of continuous linear functions and the duality pairing, respectively, and denote the closed unit ball by $B_X := \{x \in X \mid \|x\| \le 1\}$ and set $\mathbb{N} := \{1, 2, 3, \ldots\}$. We identify $X$ with its canonical image in the bidual space $X^{**}$. A set $C \subseteq X$ is said to be convex if it contains all line segments between its members: $\lambda x + (1-\lambda) y \in C$ whenever $x, y \in C$ and $0 \le \lambda \le 1$.

Given a subset $C$ of $X$, $\operatorname{int} C$ is the interior of $C$ and $\overline{C}$ is the norm closure of $C$. For a set $D \subseteq X^*$, $\overline{D}^{w^*}$ is the weak* closure of $D$. The indicator function of $C$, written as $\iota_C$, is defined at $x \in X$ by

(1) $\iota_C(x) := \begin{cases} 0, & \text{if } x \in C; \\ +\infty, & \text{otherwise.} \end{cases}$

The support function of $C$, written as $\sigma_C$, is defined by $\sigma_C(x^*) := \sup_{c \in C} \langle c, x^* \rangle$. There is also a naturally associated (metric) distance function, that is,

(2) $d_C(x) := \inf \{ \|x - y\| \mid y \in C \}$.

Distance functions play a central role in convex analysis, both theoretically and algorithmically. 

Let $f : X \to \,]-\infty, +\infty]$ be a function. Then $\operatorname{dom} f := f^{-1}(\mathbb{R})$ is the domain of $f$, and the lower level sets of a function $f : X \to \,]-\infty, +\infty]$ are the sets $\{x \in X \mid f(x) \le \alpha\}$ where $\alpha \in \mathbb{R}$. The epigraph of $f$ is $\operatorname{epi} f := \{(x, r) \in X \times \mathbb{R} \mid f(x) \le r\}$. We will denote the set of points of continuity of $f$ by $\operatorname{cont} f$. The function $f$ is said to be convex if for any $x, y \in \operatorname{dom} f$ and any $\lambda \in [0, 1]$, one has

$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y)$.

We say $f$ is proper if $\operatorname{dom} f \ne \emptyset$. Let $f$ be proper. The subdifferential of $f$ is defined by

$\partial f : X \rightrightarrows X^* : x \mapsto \{x^* \in X^* \mid \langle x^*, y - x \rangle \le f(y) - f(x), \text{ for all } y \in X\}$.

By the definition of $\partial f$, even when $x \in \operatorname{dom} f$, it is possible that $\partial f(x)$ may be empty. For example $\partial f(0) = \emptyset$ for $f(x) := -\sqrt{x}$ whenever $x \ge 0$ and $f(x) := +\infty$ otherwise. If $x^* \in \partial f(x)$ then $x^*$ is said to be a subgradient of $f$ at $x$. An important example of a subdifferential is the normal cone to a convex set $C \subseteq X$ at a point $x \in C$ which is defined as $N_C(x) := \partial \iota_C(x)$.

Let $g : X \to \,]-\infty, +\infty]$. Then the inf-convolution $f \,\square\, g$ is the function defined on $X$ by

$f \,\square\, g : x \mapsto \inf_{y \in X} \{ f(y) + g(x - y) \}$.

(In [42] Moreau studied inf-convolution when $X$ is an arbitrary commutative semigroup.) Notice that, if both $f$ and $g$ are convex, so is $f \,\square\, g$ (see, e.g., [IHl p. 17]).

We will say a function $f : X \to \,]-\infty, +\infty]$ is Lipschitz on a subset $D$ of $X$ if there is a constant $M \ge 0$ so that $|f(x) - f(y)| \le M \|x - y\|$ for all $x, y \in D$. In this case $M$ is said to be a Lipschitz constant for $f$ on $D$. If for each $x_0 \in D$, there is an open set $U \subseteq D$ with $x_0 \in U$ and a constant $M$ so that $|f(x) - f(y)| \le M \|x - y\|$ for all $x, y \in U$, we will say $f$ is locally Lipschitz on $D$. If $D$ is the entire space, we simply say $f$ is Lipschitz or locally Lipschitz respectively.

Consider a function $f : X \to \,]-\infty, +\infty]$; we say $f$ is lower semicontinuous (lsc) if $\liminf_{x \to \bar{x}} f(x) \ge f(\bar{x})$ for all $\bar{x} \in X$, or equivalently, if $\operatorname{epi} f$ is closed. The function $f$ is said to be sequentially weakly lower semicontinuous if for every $\bar{x} \in X$ and every sequence $(x_n)_{n \in \mathbb{N}}$ which is weakly convergent to $\bar{x}$, one has $\liminf_{n \to \infty} f(x_n) \ge f(\bar{x})$. This is a useful distinction since there are infinite dimensional Banach spaces (Schur spaces such as $\ell^1$) in which weak and norm convergence coincide for sequences, see [20, p. 384, esp. Thm 8.2.5].

1.2 Structure of this paper 

The remainder of this paper is organized as follows. In Section 2 we describe results about Fenchel conjugates and the subdifferential operator, such as Fenchel duality, the Sandwich theorem, etc. We also look at some interesting convex functions and inequalities. In Section 3 we discuss the Chebyshev problem from abstract approximation. In Section 4 we show applications of convex analysis in Monotone Operator Theory. We reprise such results as the Minty surjectivity theorem, and present a new proof of the sum theorem in reflexive spaces. We also discuss Fitzpatrick's problem on so-called autoconjugate representers for maximally monotone operators. In Section 5 we discuss various other applications.

2 Subdifferential operators, conjugate functions & Fenchel duality

We begin with some fundamental properties of convex sets and convex functions. While many results hold in all locally convex spaces, some of the most important such as (iv)(b) in the next Fact do not.

Fact 2.1 (Basic properties [20, Ch. 2 and 4].) The following hold.

(i) The (lsc) convex functions form a convex cone closed under pointwise suprema: if $f_\gamma$ is convex (and lsc) for each $\gamma \in \Gamma$ then so is $x \mapsto \sup_{\gamma \in \Gamma} f_\gamma(x)$.

(ii) A function $f$ is convex if and only if $\operatorname{epi} f$ is convex if and only if $\iota_{\operatorname{epi} f}$ is convex.

(iii) Global minima and local minima coincide for convex functions.

(iv) (a) A proper convex function is locally Lipschitz if and only if it is continuous if and only if it is locally bounded. (b) Additionally, if the function is lower semicontinuous, then it is continuous at every point in the interior of its domain.

(v) A proper lower semicontinuous and convex function is bounded from below by a continuous affine function.

(vi) If $C$ is a nonempty set, then $d_C(\cdot)$ is non-expansive (i.e., is a Lipschitz function with constant one). Additionally, if $C$ is convex, then $d_C(\cdot)$ is a convex function.

(vii) If $C$ is a convex set, then $C$ is weakly closed if and only if it is norm closed.

(viii) Three-slope inequality: Suppose $f : \mathbb{R} \to \,]-\infty, \infty]$ is convex and $a < b < c$. Then

$\dfrac{f(b) - f(a)}{b - a} \le \dfrac{f(c) - f(a)}{c - a} \le \dfrac{f(c) - f(b)}{c - b}$.

The following trivial fact shows the fundamental significance of subgradients in optimization. 

Proposition 2.2 (Subdifferential at optimality) Let $f : X \to \,]-\infty, +\infty]$ be a proper convex function. Then the point $\bar{x} \in X$ is a local minimizer of $f$ if and only if $0 \in \partial f(\bar{x})$.

The directional derivative of $f$ at $\bar{x} \in \operatorname{dom} f$ in the direction $d$ is defined by

$f'(\bar{x}; d) := \lim_{t \to 0^+} \dfrac{f(\bar{x} + t d) - f(\bar{x})}{t}$

if the limit exists. If $f$ is convex, the directional derivative is everywhere finite at any point of $\operatorname{int} \operatorname{dom} f$, and it turns out to be Lipschitz at $\operatorname{cont} f$. We use the term directional derivative with the understanding that it is actually a one-sided directional derivative.

If the directional derivative $f'(\bar{x}; d)$ exists for all directions $d$ and the operator $f'(\bar{x})$ defined by $\langle f'(\bar{x}), \cdot \rangle := f'(\bar{x}; \cdot)$ is linear and bounded, then we say that $f$ is Gateaux differentiable at $\bar{x}$, and $f'(\bar{x})$ is called the Gateaux derivative. Every function $f : X \to \,]-\infty, +\infty]$ which is lower semicontinuous, convex and Gateaux differentiable at $x$ is continuous at $x$. Additionally, the following properties are relevant for the existence and uniqueness of the subgradients.

Proposition 2.3 (See [20, Fact 4.2.4 and Corollary 4.2.5].) Suppose $f : X \to \,]-\infty, +\infty]$ is convex.

(i) If $f$ is Gateaux differentiable at $x$, then $f'(x) \in \partial f(x)$.

(ii) If $f$ is continuous at $x$, then $f$ is Gateaux differentiable at $x$ if and only if $\partial f(x)$ is a singleton.



Example 2.4 We show that part (ii) in Proposition 2.3 is not always true in infinite dimensions without continuity hypotheses.

(a) The indicator of the Hilbert cube $C := \{x = (x_1, x_2, \ldots) \in \ell^2 : |x_n| \le 1/n, \ \forall n \in \mathbb{N}\}$ at zero or any other non-support point has a unique subgradient but is nowhere Gateaux differentiable.

(b) Boltzmann-Shannon entropy $x \mapsto \int_0^1 x(t) \log(x(t)) \, dt$ viewed as a lower semicontinuous and convex function on $L^1[0, 1]$ has unique subgradients at $x(t) > 0$ a.e. but is nowhere Gateaux differentiable (which for a lower semicontinuous and convex function in Banach space implies continuity).

That Gateaux differentiability of a convex closed function implies continuity at the point is a consequence of the Baire category theorem.

The next result, proved by Moreau in 1963, establishes the relationship between subgradients and directional derivatives, see also [46, page 65]. Proofs can also be found in most books on variational analysis, see e.g. [23, Theorem 4.2.7].
Theorem 2.5 (Moreau's max formula [43]) Let $f : X \to \,]-\infty, +\infty]$ be a convex function and let $d \in X$. Suppose that $f$ is continuous at $\bar{x}$. Then, $\partial f(\bar{x}) \ne \emptyset$ and

(3) $f'(\bar{x}; d) = \max \{ \langle x^*, d \rangle \mid x^* \in \partial f(\bar{x}) \}$.
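As an illustration of the max formula (3), the following minimal sketch compares a one-sided difference quotient with the maximum over the subdifferential for the one-dimensional function $f(x) = |x|$ at $\bar{x} = 0$, where $\partial f(0) = [-1, 1]$. The function names and the step size `t` are illustrative choices and are not taken from the paper.

```python
def f(x):
    """The convex function f(x) = |x| on the real line."""
    return abs(x)


def directional_derivative(func, x, d, t=1e-8):
    """One-sided difference quotient approximating f'(x; d) for small t > 0."""
    return (func(x + t * d) - func(x)) / t


def max_over_subdifferential(d, lo=-1.0, hi=1.0):
    """Maximum of x* . d over the interval [lo, hi].

    For f = |.| at the origin the subdifferential is [-1, 1]; a linear
    function on an interval attains its maximum at an endpoint.
    """
    return max(lo * d, hi * d)


if __name__ == "__main__":
    for d in (-2.0, 0.5, 3.0):
        print(d, directional_derivative(f, 0.0, d), max_over_subdifferential(d))
```

In this example both sides equal $|d|$, so the two printed columns should agree up to rounding.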

Let $f : X \to [-\infty, +\infty]$. The Fenchel conjugate (also called the Legendre-Fenchel conjugate¹ or transform) of $f$ is the function $f^* : X^* \to [-\infty, +\infty]$ defined by

$f^*(x^*) := \sup_{x \in X} \{ \langle x^*, x \rangle - f(x) \}$.

We can also consider the conjugate of $f^*$ called the biconjugate of $f$ and denoted by $f^{**}$. This is a convex function on $X^{**}$ satisfying $f^{**}|_X \le f$. A useful and instructive example is $\sigma_C = \iota_C^*$.

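Example 2.6 Let $1 < p < \infty$. If $f(x) := \frac{\|x\|^p}{p}$ for $x \in X$ then $f^*(x^*) = \frac{\|x^*\|_*^q}{q}$, where $\frac{1}{p} + \frac{1}{q} = 1$. Indeed, for any $x^* \in X^*$, one has

$f^*(x^*) = \sup_{\lambda \in \mathbb{R}_+} \sup_{\|x\| = 1} \left\{ \langle x^*, \lambda x \rangle - \frac{\lambda^p}{p} \right\} = \sup_{\lambda \in \mathbb{R}_+} \left\{ \lambda \|x^*\|_* - \frac{\lambda^p}{p} \right\} = \frac{\|x^*\|_*^q}{q}$.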





By direct construction and Fact 2.1(i) the conjugate function $f^*$ is always convex and closed, and if the domain of $f$ is nonempty, then $f^*$ never takes the value $-\infty$. The conjugate plays a role in convex analysis in many ways analogous to the role played by the Fourier transform in harmonic analysis.
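The conjugate in Example 2.6 can be checked numerically in the simplest setting $X = \mathbb{R}$. The sketch below approximates the supremum defining $f^*$ by a maximum over a finite grid and compares it with the closed form $|y|^q / q$. The grid bounds, the grid spacing and the helper names are assumptions made for illustration only.

```python
def conjugate_on_grid(func, y, grid):
    """Approximate f*(y) = sup_x { y*x - f(x) } by a maximum over a finite grid."""
    return max(y * x - func(x) for x in grid)


P = 3.0
Q = P / (P - 1.0)  # conjugate exponent, 1/P + 1/Q = 1

# Grid on [-5, 5] with spacing 1e-3; wide enough for the test values of y below.
GRID = [i / 1000.0 for i in range(-5000, 5001)]


def power_function(x):
    """f(x) = |x|^P / P."""
    return abs(x) ** P / P


def closed_form_conjugate(y):
    """f*(y) = |y|^Q / Q, as stated in Example 2.6."""
    return abs(y) ** Q / Q


if __name__ == "__main__":
    for y in (-1.5, 0.0, 2.0):
        print(y, conjugate_on_grid(power_function, y, GRID), closed_form_conjugate(y))
```

The grid maximum can only underestimate the true supremum, so agreement holds up to a small discretization error.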

2.1 Inequalities and their applications 

An immediate consequence of the definition is that for $f, g : X \to [-\infty, +\infty]$, the inequality $f \ge g$ implies $f^* \le g^*$. An important result which is straightforward to prove is the following.

Proposition 2.7 (Fenchel-Young) Let $f : X \to \,]-\infty, +\infty]$. All points $x^* \in X^*$ and $x \in \operatorname{dom} f$ satisfy the inequality

(4) $f(x) + f^*(x^*) \ge \langle x^*, x \rangle$.

Equality holds if and only if $x^* \in \partial f(x)$.



Example 2.8 (Young's inequality) By taking $f$ as in Example 2.6, one directly obtains from Proposition 2.7

$\dfrac{\|x\|^p}{p} + \dfrac{\|x^*\|_*^q}{q} \ge \langle x^*, x \rangle$,

for all $x \in X$ and $x^* \in X^*$, where $p > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$. When $X = \mathbb{R}$ one recovers the original Young inequality.



¹Originally the connection was made between a monotone function on an interval and its inverse. The convex functions then arise by integration.
This in turn leads to one of the workhorses of modern analysis:

Example 2.9 (Hölder's inequality) Let $f$ and $g$ be measurable on a measure space $(X, \mu)$. Then

(5) $\int_X f g \, d\mu \le \|f\|_p \|g\|_q$,

where $1 < p < \infty$ and $\frac{1}{p} + \frac{1}{q} = 1$. Indeed, by rescaling, we may assume without loss of generality that $\|f\|_p = \|g\|_q = 1$. Then Young's inequality in Example 2.8 yields

$|f(x) g(x)| \le \dfrac{|f(x)|^p}{p} + \dfrac{|g(x)|^q}{q}$ for $x \in X$,

and (5) follows by integrating both sides. The result holds true in the limit for $p = 1$ or $p = \infty$.
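For counting measure on a finite set, inequality (5) becomes a statement about finite sums, which is easy to test. The following sketch, with hypothetical helper names, computes the gap $\|u\|_p \|v\|_q - \sum_i u_i v_i$, which should be nonnegative, and also exercises an equality case in which $|v_i|^q$ is proportional to $|u_i|^p$.

```python
def p_norm(vector, p):
    """The l^p norm of a finite sequence."""
    return sum(abs(t) ** p for t in vector) ** (1.0 / p)


def holder_gap(u, v, p):
    """Return ||u||_p * ||v||_q - sum(u_i * v_i), where 1/p + 1/q = 1.

    Hoelder's inequality for counting measure says this gap is nonnegative.
    """
    q = p / (p - 1.0)
    inner = sum(a * b for a, b in zip(u, v))
    return p_norm(u, p) * p_norm(v, q) - inner


if __name__ == "__main__":
    print(holder_gap([1.0, -2.0, 3.0], [0.5, 4.0, -1.0], 2.0))
    # Equality case: v_i = u_i^(p-1) with p = 3, so |v_i|^q = |u_i|^p.
    print(holder_gap([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], 3.0))
```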

We next take a brief excursion into special function theory and normed space geometry to 
emphasize that "convex functions are everywhere." 

Example 2.10 (Bohr-Mollerup theorem) The Gamma function defined for $x > 0$ as

$\Gamma(x) := \int_0^\infty e^{-t} t^{x-1} \, dt = \lim_{n \to \infty} \dfrac{n! \, n^x}{x (x+1) \cdots (x+n)}$

is the unique function $f$ mapping the positive half-line to itself and such that (a) $f(1) = 1$, (b) $x f(x) = f(x+1)$ and (c) $\log f$ is a convex function.

Indeed, clearly $\Gamma(1) = 1$, and it is easy to prove (b) for $\Gamma$ by using integration by parts. In order to show that $\log \Gamma$ is convex, pick any $x, y > 0$ and $\lambda \in (0, 1)$ and apply Hölder's inequality (5) with $p = 1/\lambda$ to the functions $t \mapsto e^{-\lambda t} t^{\lambda (x-1)}$ and $t \mapsto e^{-(1-\lambda) t} t^{(1-\lambda)(y-1)}$. For the converse, let $g := \log f$. Then (a) and (b) imply $g(n+1) = \log(n!)$. Convexity of $g$ together with the three-slope inequality, see Fact 2.1(viii), implies that

$g(n+1) - g(n) \le \dfrac{g(n+1+x) - g(n+1)}{x} \le g(n+2+x) - g(n+1+x)$,

and hence,

$x \log(n) \le \log \big( x (x+1) \cdots (x+n) f(x) \big) - \log(n!) \le x \log(n+1+x)$;

whence,

$0 \le g(x) - \log \left( \dfrac{n! \, n^x}{x (x+1) \cdots (x+n)} \right) \le x \log \left( 1 + \dfrac{1+x}{n} \right)$.

Taking limits when $n \to \infty$ we obtain

$f(x) = \lim_{n \to \infty} \dfrac{n! \, n^x}{x (x+1) \cdots (x+n)} = \Gamma(x)$.

As a bonus we recover a classical and important limit formula for $\Gamma(x)$.
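The two-sided estimate obtained in the proof can be observed numerically. The sketch below evaluates the logarithm of the finite product $n! \, n^x / (x (x+1) \cdots (x+n))$ and compares it with `math.lgamma(x)`; working in logarithms avoids overflow for large $n$. The function names and the sample values of $x$ and $n$ are illustrative assumptions.

```python
import math


def log_gamma_product(x, n):
    """Logarithm of n! * n^x / (x (x+1) ... (x+n)) for x > 0 and integer n >= 1."""
    log_denominator = sum(math.log(x + k) for k in range(n + 1))
    return math.lgamma(n + 1) + x * math.log(n) - log_denominator


def gap_and_bound(x, n):
    """Return (log Gamma(x) - log product, x * log(1 + (1 + x) / n)).

    The estimate in Example 2.10 says the first entry lies between 0 and the second.
    """
    gap = math.lgamma(x) - log_gamma_product(x, n)
    bound = x * math.log(1.0 + (1.0 + x) / n)
    return gap, bound


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, gap_and_bound(2.5, n))
```

The gap shrinks roughly like $1/n$, so the product formula converges, although slowly.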
Application of the Bohr-Mollerup theorem is often automatable in a computer algebra system, as we now illustrate. Consider the beta function

(6) $\beta(x, y) := \int_0^1 t^{x-1} (1-t)^{y-1} \, dt$

for $\operatorname{Re}(x), \operatorname{Re}(y) > 0$. As is often established using polar coordinates and double integrals

(7) $\beta(x, y) = \dfrac{\Gamma(x) \, \Gamma(y)}{\Gamma(x+y)}$.

We may use the Bohr-Mollerup theorem with

$f := x \mapsto \beta(x, y) \, \Gamma(x+y) / \Gamma(y)$

to prove (7) for real $x, y$.

Now (a) and (b) from Example 2.10 are easy to verify. For (c) we again use Hölder's inequality to show $f$ is log-convex. Thus, $f = \Gamma$ as required.
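Identity (7) is also easy to sanity-check numerically. The sketch below approximates the integral (6) with a composite midpoint rule and compares it with the Gamma quotient evaluated through `math.lgamma`. The quadrature rule, the step count and the restriction to $x, y \ge 1$ (where the integrand is bounded) are assumptions of this illustration, not part of the paper.

```python
import math


def beta_midpoint(x, y, steps=20000):
    """Midpoint-rule approximation of the integral of t^(x-1) (1-t)^(y-1) over [0, 1].

    Intended for x, y >= 1, where the integrand is bounded on [0, 1].
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (x - 1.0) * (1.0 - t) ** (y - 1.0)
    return h * total


def beta_from_gamma(x, y):
    """Gamma(x) Gamma(y) / Gamma(x + y), computed via logarithms."""
    return math.exp(math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y))


if __name__ == "__main__":
    print(beta_midpoint(2.5, 3.5), beta_from_gamma(2.5, 3.5))
```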

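Example 2.11 (Blaschke-Santaló theorem) The volume of a ball in the $\|\cdot\|_p$-norm, $V_n(p)$, is

(8) $V_n(p) = 2^n \, \dfrac{\Gamma\left(1 + \frac{1}{p}\right)^n}{\Gamma\left(1 + \frac{n}{p}\right)}$,

as was first determined by Dirichlet. When $p = 2$, this gives

$V_n = 2^n \, \dfrac{\Gamma\left(\frac{3}{2}\right)^n}{\Gamma\left(1 + \frac{n}{2}\right)} = \dfrac{\Gamma\left(\frac{1}{2}\right)^n}{\Gamma\left(1 + \frac{n}{2}\right)}$,

which is more concise than that usually recorded in texts.

Let $C$ be a convex body in $\mathbb{R}^n$, that is, a closed bounded convex set with nonempty interior. Denoting the $n$-dimensional Euclidean volume of $S \subseteq \mathbb{R}^n$ by $V_n(S)$, the Blaschke-Santaló inequality says

(9) $V_n(C) \, V_n(C^\circ) \le V_n(E) \, V_n(E^\circ) = V_n^2(B_n(2))$,

where maximality holds (only) for any ellipsoid $E$ and $B_n(2)$ is the Euclidean unit ball. It is conjectured the minimum is attained by the 1-ball and the $\infty$-ball. Here as always the polar set is defined by $C^\circ := \{y \in \mathbb{R}^n : \langle y, x \rangle \le 1 \text{ for all } x \in C\}$.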

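The $p$-ball case of (9) follows by proving the following convexity result:

Theorem 2.12 (Harmonic-arithmetic log-concavity) The function

$V_\alpha(p) := 2^\alpha \, \Gamma\left(1 + \frac{1}{p}\right)^\alpha \Big/ \Gamma\left(1 + \frac{\alpha}{p}\right)$

satisfies

(10) $V_\alpha(p)^\lambda \, V_\alpha(q)^{1-\lambda} < V_\alpha\left( \dfrac{1}{\frac{\lambda}{p} + \frac{1-\lambda}{q}} \right)$

for all $\alpha > 1$, if $p, q > 1$, $p \ne q$, and $\lambda \in (0, 1)$.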
Set $\alpha := n$, $\frac{1}{p} + \frac{1}{q} = 1$ with $\lambda = 1 - \lambda = 1/2$ to recover the $p$-norm case of the Blaschke-Santaló inequality. It is amusing to deduce the corresponding lower bound. This technique extends to various substitution norms. Further details may be found in [14, §5.5]. Note that we may easily explore $V_\alpha(p)$ graphically.
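Besides plotting, $V_\alpha(p)$ can be explored numerically. The following sketch evaluates $V_\alpha(p)$ from the formula in Theorem 2.12 using `math.lgamma`, and tests inequality (10) and the $p$-norm case of (9) at a few sample points. The helper names and sample values are illustrative assumptions; passing checks at sample points is of course not a proof.

```python
import math


def log_volume(alpha, p):
    """log V_alpha(p) with V_alpha(p) = 2^alpha Gamma(1 + 1/p)^alpha / Gamma(1 + alpha/p)."""
    return (alpha * math.log(2.0)
            + alpha * math.lgamma(1.0 + 1.0 / p)
            - math.lgamma(1.0 + alpha / p))


def volume(alpha, p):
    """V_alpha(p); for integer alpha = n this is the volume of the unit p-ball in R^n."""
    return math.exp(log_volume(alpha, p))


def harmonic_mean(p, q, lam):
    """Weighted harmonic mean 1 / (lam/p + (1 - lam)/q)."""
    return 1.0 / (lam / p + (1.0 - lam) / q)


def log_concavity_gap(alpha, p, q, lam):
    """Right side minus left side of (10), after taking logarithms."""
    left = lam * log_volume(alpha, p) + (1.0 - lam) * log_volume(alpha, q)
    right = log_volume(alpha, harmonic_mean(p, q, lam))
    return right - left


if __name__ == "__main__":
    print(volume(3, 2.0), 4.0 * math.pi / 3.0)    # Euclidean unit ball in R^3
    print(log_concavity_gap(3, 1.5, 4.0, 0.3))    # expected to be positive
    print(volume(3, 3.0) * volume(3, 1.5), volume(3, 2.0) ** 2)
```

With $p = 3$ and $q = 3/2$ the exponents are conjugate, so the last line compares the volume product of a $p$-ball and its polar with the squared volume of the Euclidean ball.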

2.2 The biconjugate and duality 

The next result has been associated by different authors with the names of Legendre, Fenchel, Moreau and Hörmander; see, e.g., [20, Proposition 4.4.2].

Proposition 2.13 (Hörmander) Let $f : X \to \,]-\infty, +\infty]$ be a proper function. Then

$f$ is convex and lower semicontinuous $\iff$ $f = f^{**}|_X$.
Example 2.14 (Establishing convexity) (See [101, Theorem 1].) We may compute conjugates by hand or using the software SCAT [18]. This is discussed further in Section 5.3. Consider $f(x) := e^x$. Then $f^*(x) = x \log(x) - x$ for $x \ge 0$ (taken to be zero at zero) and is infinite for $x < 0$. This establishes the convexity of $x \log(x) - x$ in a way that takes no knowledge of $x \log(x)$.

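A more challenging case is the following (slightly corrected) conjugation formula [19, p. 94, Ex. 13] which can be computed algorithmically: Given real $\alpha_1, \alpha_2, \ldots, \alpha_m > 0$, define $\alpha := \sum_i \alpha_i$ and suppose a real $\mu$ satisfies $\mu > \alpha + 1$. Now define a function $f : \mathbb{R}^m \times \mathbb{R} \to \,]-\infty, +\infty]$ by

$f(x, s) := \begin{cases} \mu^{-1} s^\mu \prod_i x_i^{-\alpha_i} & \text{if } x \in \mathbb{R}^m_{++}, \ s \in \mathbb{R}_+; \\ 0 & \text{if } \exists \, x_i = 0, \ x \in \mathbb{R}^m_+, \ s = 0; \\ +\infty & \text{otherwise,} \end{cases} \qquad \forall x := (x_n)_{n=1}^m \in \mathbb{R}^m, \ s \in \mathbb{R}$.

It transpires that

$f^*(y, t) = \begin{cases} \rho \, \nu^{-1} t^\nu \prod_i (-y_i)^{-\beta_i} & \text{if } y \in \mathbb{R}^m_{--}, \ t \in \mathbb{R}_+; \\ 0 & \text{if } y \in \mathbb{R}^m_-, \ t \in \mathbb{R}_-; \\ +\infty & \text{otherwise,} \end{cases} \qquad \forall y := (y_n)_{n=1}^m \in \mathbb{R}^m, \ t \in \mathbb{R}$,

for constants

$\nu := \dfrac{\mu}{\mu - (\alpha + 1)}, \quad \beta_i := \dfrac{\alpha_i}{\mu - (\alpha + 1)}, \quad \rho := \prod_i \left( \dfrac{\alpha_i}{\mu} \right)^{\beta_i}$.

We deduce that $f = f^{**}$, whence $f$ (and $f^*$) is (essentially strictly) convex. For an attractive alternative proof of convexity see [39]. Many other substantive examples are to be found in [19, 20].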

The next theorem gives us a remarkable sufficient condition for convexity of functions in terms 
of the Gateaux differentiability of the conjugate. 

Theorem 2.15 (See [20, Corollary 4.5.2].) Suppose $f : X \to \,]-\infty, +\infty]$ is such that $f^{**}$ is proper. If $f^*$ is Gateaux differentiable at all $x^* \in \operatorname{dom} \partial f^*$ and $f$ is sequentially weakly lower semicontinuous, then $f$ is convex.
Let $f : X \to \,]-\infty, +\infty]$. We say $f$ is coercive if $\lim_{\|x\| \to \infty} f(x) = +\infty$. We say $f$ is supercoercive if $\lim_{\|x\| \to \infty} \frac{f(x)}{\|x\|} = +\infty$.



Fact 2.16 (See [20, Fact 4.4.8].) If $f$ is proper convex and lower semicontinuous at some point in its domain, then the following statements are equivalent.

(i) $f$ is coercive.

(ii) There exist $\alpha > 0$ and $\beta \in \mathbb{R}$ such that $f \ge \alpha \|\cdot\| + \beta$.

(iii) $\liminf_{\|x\| \to \infty} f(x) / \|x\| > 0$.

(iv) $f$ has bounded lower level sets.

Because a convex function is continuous at a point if and only if it is bounded above on a neighborhood of that point (Fact 2.1(iv)), we get the following result; see also [35, Theorem 7] for the case of the indicator function of a bounded convex set.

Theorem 2.17 (Hörmander-Moreau-Rockafellar) Let $f : X \to \,]-\infty, +\infty]$ be convex and lower semicontinuous at some point in its domain, and let $x^* \in X^*$. Then $f - x^*$ is coercive if and only if $f^*$ is continuous at $x^*$.



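Proof. "$\Rightarrow$": By Fact 2.16 there exist $\alpha > 0$ and $\beta \in \mathbb{R}$ such that $f \ge x^* + \alpha \|\cdot\| + \beta$. Then $f^* \le -\beta + \iota_{\{x^* + \alpha B_{X^*}\}}$, from where $x^* + \alpha B_{X^*} \subseteq \operatorname{dom} f^*$. Therefore, $f^*$ is continuous at $x^*$ by Fact 2.1(iv).

"$\Leftarrow$": By the assumption, there exist $\beta \in \mathbb{R}$ and $\delta > 0$ such that

$f^*(x^* + z^*) \le \beta, \quad \forall z^* \in \delta B_{X^*}$.

Thus, by Proposition 2.7,

$\langle x^* + z^*, y \rangle - f(y) \le \beta, \quad \forall z^* \in \delta B_{X^*}, \ \forall y \in X$;

whence, taking the supremum with $z^* \in \delta B_{X^*}$,

$\delta \|y\| - \beta \le f(y) - \langle x^*, y \rangle, \quad \forall y \in X$.

Then, by Fact 2.16, $f - x^*$ is coercive. ■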

Example 2.18 Given a set $C$ in $X$, recall that the negative polar cone of $C$ is the convex cone

$C^- := \{x^* \in X^* \mid \sup \langle x^*, C \rangle \le 0\}$.

Let $K$ be a closed convex cone. Then $K^-$ is another nonempty closed convex cone with $K^{--} = K$. Moreover, the indicator functions of $K$ and $K^-$ are conjugate to each other. If we set $f := \iota_{K^-}$, the indicator function of the negative polar cone of $K$, Theorem 2.17 applies to get that

$x \in \operatorname{int} K$ if and only if the set $\{x^* \in K^- \mid \langle x^*, x \rangle \ge \alpha\}$ is bounded for any $\alpha \in \mathbb{R}$.

Indeed, since $x \in \operatorname{int} K = \operatorname{int} \operatorname{dom} \iota_{K^-}^*$ if and only if $\iota_{K^-}^*$ is continuous at $x$, from Theorem 2.17 we have that this is true if and only if the function $\iota_{K^-} - x$ is coercive. Now, Fact 2.16 assures us that coerciveness is equivalent to boundedness of the lower level sets, which implies the assertion.



Theorem 2.19 (Moreau-Rockafellar duality [44]) Let $f : X \to (-\infty, +\infty]$ be a lower semicontinuous convex function. Then $f$ is continuous at $0$ if and only if $f^*$ has weak*-compact lower level sets.

Proof. Observe that $f$ is continuous at $0$ if and only if $f^{**}$ is continuous at $0$ ([20, Fact 4.4.4(b)]) if and only if $f^*$ is coercive (Theorem 2.17) if and only if $f^*$ has bounded lower level sets (Fact 2.16) if and only if $f^*$ has weak*-compact lower level sets by the Banach-Alaoglu theorem (see [ , Theorem 3.15]). ■

Theorem 2.20 (Conjugates of supercoercive functions) Suppose $f : X \to \,]-\infty, +\infty]$ is a lower semicontinuous and proper convex function. Then

(a) $f$ is supercoercive if and only if $f^*$ is bounded (above) on bounded sets.

(b) $f$ is bounded (above) on bounded sets if and only if $f^*$ is supercoercive.

Proof. (a) "$\Rightarrow$": Given any $\alpha > 0$, there exists $M$ such that $f(x) \ge \alpha \|x\|$ if $\|x\| \ge M$. Now, by Fact 2.1(v), $f$ is bounded below on the bounded set $\{x \mid \|x\| \le M\}$, so there exists $\beta \ge 0$ such that $f(x) \ge \alpha \|x\| - \beta$ if $\|x\| \le M$. Therefore $f \ge \alpha \|\cdot\| + (-\beta)$. Thus, it implies that $f^* \le \alpha (\|\cdot\|)^*(\cdot / \alpha) + \beta$ and hence $f^* \le \beta$ on $\alpha B_{X^*}$.

"$\Leftarrow$": Let $\alpha > 0$. Now there exists $K$ such that $f^* \le K$ on $\alpha B_{X^*}$. Then $f \ge \alpha \|\cdot\| - K$ and so $\liminf_{\|x\| \to \infty} \frac{f(x)}{\|x\|} \ge \alpha$.

(b): According to (a), $f^*$ is supercoercive if and only if $f^{**}$ is bounded on bounded sets. By [20, Fact 4.4.4(a)] this holds if and only if $f$ is bounded (above) on bounded sets. ■

We finish this subsection by recalling some properties of infimal convolutions. Some of their many applications include smoothing techniques and approximation. We shall meet them again in Section 4. Let $f, g : X \to \,]-\infty, +\infty]$. Geometrically, the infimal convolution of $f$ and $g$ is the largest extended real-valued function whose epigraph contains the sum of epigraphs of $f$ and $g$ (see example in Figure 1), consequently it is a convex function. The following is a useful result concerning the conjugate of the infimal convolution.

Fact 2.21 (See [20, Lemma 4.4.15] and [46, pp. 37-38].) If $f$ and $g$ are proper functions on $X$, then $(f \,\square\, g)^* = f^* + g^*$. Additionally, suppose $f, g$ are convex and bounded below. If $f : X \to \mathbb{R}$ is continuous (resp. bounded on bounded sets, Lipschitz), then $f \,\square\, g$ is a convex function that is continuous (resp. bounded on bounded sets, Lipschitz).

Remark 2.22 Suppose $C$ is a nonempty convex set. Then $d_C = \|\cdot\| \,\square\, \iota_C$, implying that $d_C$ is a Lipschitz convex function.
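Example 2.23 Consider $f, g : \mathbb{R} \to \,]-\infty, +\infty]$ given by

$f(x) := \begin{cases} -\sqrt{1 - x^2}, & \text{for } -1 \le x \le 1, \\ +\infty & \text{otherwise,} \end{cases}$ and $g(x) := |x|$.

The infimal convolution of $f$ and $g$ is

$(f \,\square\, g)(x) = \begin{cases} -\sqrt{1 - x^2}, & -\frac{\sqrt{2}}{2} \le x \le \frac{\sqrt{2}}{2}; \\ |x| - \sqrt{2}, & \text{otherwise,} \end{cases}$

as shown in Figure 1.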
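The closed form in Example 2.23 can be compared with a brute-force evaluation of the infimum. The sketch below minimizes $f(y) + g(x - y)$ over a uniform grid of $y \in [-1, 1]$, which is the domain of $f$, and compares the result with the piecewise formula. The grid size and function names are assumptions made for illustration.

```python
import math


def f(x):
    """f(x) = -sqrt(1 - x^2) on [-1, 1], +infinity elsewhere."""
    return -math.sqrt(1.0 - x * x) if -1.0 <= x <= 1.0 else math.inf


def g(x):
    """g(x) = |x|."""
    return abs(x)


def inf_convolution_grid(x, steps=4000):
    """Approximate (f [] g)(x) = inf_y { f(y) + g(x - y) } over a grid of y in [-1, 1]."""
    best = math.inf
    for i in range(steps + 1):
        y = -1.0 + 2.0 * i / steps
        best = min(best, f(y) + g(x - y))
    return best


def inf_convolution_closed_form(x):
    """Piecewise formula stated in Example 2.23."""
    if abs(x) <= math.sqrt(2.0) / 2.0:
        return -math.sqrt(1.0 - x * x)
    return abs(x) - math.sqrt(2.0)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.3, 1.0, 3.0):
        print(x, inf_convolution_grid(x), inf_convolution_closed_form(x))
```

Since the grid search can only overestimate the infimum, the two values should agree up to a discretization error of the order of the grid spacing.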




2.3 The Hahn-Banach circle 

Let $T : X \to Y$ be a linear mapping between two Banach spaces $X$ and $Y$. The adjoint of $T$ is the linear mapping $T^* : Y^* \to X^*$ defined, for $y^* \in Y^*$, by

$\langle T^* y^*, x \rangle = \langle y^*, T x \rangle$ for all $x \in X$.

A flexible modern version of Fenchel's celebrated duality theorem is: 

Theorem 2.24 (Fenchel duality) Let $Y$ be another Banach space, let $f : X \to \,]-\infty, +\infty]$ and $g : Y \to \,]-\infty, +\infty]$ be convex functions and let $T : X \to Y$ be a bounded linear operator. Define the primal and dual values $p, d \in [-\infty, +\infty]$ by solving the Fenchel problems

$p := \inf_{x \in X} \{ f(x) + g(Tx) \}$

(11) $d := \sup_{y^* \in Y^*} \{ -f^*(T^* y^*) - g^*(-y^*) \}$.

Then these values satisfy the weak duality inequality $p \ge d$.
Suppose further that $f$, $g$ and $T$ satisfy either

(12) $\bigcup_{\lambda > 0} \lambda \left[ \operatorname{dom} g - T \operatorname{dom} f \right] = Y$ and both $f$ and $g$ are lower semicontinuous,

or the condition

(13) $\operatorname{cont} g \cap T \operatorname{dom} f \ne \emptyset$.

Then $p = d$, and the supremum in the dual problem (11) is attained when finite. Moreover, the perturbation function $h(u) := \inf_x f(x) + g(Tx + u)$ is convex and continuous at zero.

Generalized Fenchel duality results can be found in [25, 24]. An easy consequence is:

Corollary 2.25 (Infimal convolution) Under the hypotheses of the Fenchel duality Theorem 2.24, $(f + g)^*(x^*) = (f^* \,\square\, g^*)(x^*)$ with attainment when finite.

Another nice consequence of Fenchel duality is the ability to obtain primal solutions from dual 
ones, as we now record. 



Corollary 2.26 Suppose the conditions for equality in the Fenchel duality Theorem 2.24 hold, and that $y^* \in Y^*$ is an optimal dual solution. Then the point $\bar{x} \in X$ is optimal for the primal problem if and only if it satisfies the two conditions $\bar{x} \in \partial f^*(T^* y^*)$ and $T \bar{x} \in \partial g^*(-y^*)$.

The regularity conditions in the Fenchel duality theorem can be weakened when each function is polyhedral, i.e., when its epigraph is polyhedral.

Theorem 2.27 (Polyhedral Fenchel duality) (See [19, Corollary 5.1.10].) The conclusions of the Fenchel duality Theorem 2.24 remain valid if the regularity condition (12) is replaced by the assumption that the functions $f$ and $g$ are polyhedral with

$\operatorname{dom} g \cap T \operatorname{dom} f \ne \emptyset$.

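Fenchel duality applied to a linear program yields the well-known Lagrangian duality.

Corollary 2.28 (Linear programming duality) Given $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$ and $A$ an $m \times n$ real matrix, one has

(14) $\inf_{x \in \mathbb{R}^n} \{ c^T x \mid Ax \le b \} \ge \sup_{\lambda \in \mathbb{R}^m_+} \{ -b^T \lambda \mid A^T \lambda = -c \}$,

where $\mathbb{R}^m_+ := \{ (x_1, x_2, \ldots, x_m) \mid x_i \ge 0, \ i = 1, 2, \ldots, m \}$. Equality in (14) holds if $b \in \operatorname{ran} A$. Moreover, both extrema are obtained when finite.

Proof. Take $f(x) := c^T x$, $T := A$ and $g(y) := \iota_{b_\ge}(y)$ where $b_\ge := \{ y \in \mathbb{R}^m \mid y \le b \}$. Then apply the polyhedral Fenchel duality Theorem 2.27, observing that $f^* = \iota_{\{c\}}$, and for any $\lambda \in \mathbb{R}^m$,

$g^*(\lambda) = \sup_{y \le b} y^T \lambda = \begin{cases} b^T \lambda, & \text{if } \lambda \in \mathbb{R}^m_+; \\ +\infty, & \text{otherwise;} \end{cases}$

and (14) follows, since $\operatorname{dom} g \cap A \operatorname{dom} f = \{ Ax \mid x \in \mathbb{R}^n, \ Ax \le b \}$. ■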
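A small worked instance may help in reading (14). In the sketch below, the data `A`, `b`, `c` and the candidate solutions `x_star` and `lam_star` are a hypothetical example chosen for illustration (they do not come from the paper). The code checks primal feasibility ($Ax \le b$), dual feasibility ($\lambda \ge 0$, $A^T \lambda = -c$) and that the two objective values coincide, which by weak duality certifies that both candidates are optimal.

```python
# Hypothetical LP data: minimize c^T x subject to A x <= b.
A = [[-1.0, 0.0],
     [0.0, -1.0],
     [1.0, 1.0]]
b = [0.0, 0.0, 4.0]
c = [-1.0, -2.0]


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def primal_feasible(x, tol=1e-12):
    """Check A x <= b componentwise."""
    return all(dot(row, x) <= bi + tol for row, bi in zip(A, b))


def dual_feasible(lam, tol=1e-12):
    """Check lam >= 0 and A^T lam = -c."""
    if any(li < -tol for li in lam):
        return False
    columns = range(len(c))
    return all(abs(sum(A[i][j] * lam[i] for i in range(len(lam))) + c[j]) <= tol
               for j in columns)


x_star = [0.0, 4.0]         # candidate primal solution
lam_star = [1.0, 0.0, 2.0]  # candidate dual solution

primal_value = dot(c, x_star)    # c^T x
dual_value = -dot(b, lam_star)   # -b^T lambda

if __name__ == "__main__":
    print(primal_feasible(x_star), dual_feasible(lam_star), primal_value, dual_value)
```

Weak duality follows in one line for any feasible pair: $c^T x = -\lambda^T A x \ge -\lambda^T b$, because $\lambda \ge 0$ and $Ax \le b$.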

One can easily derive various relevant results from Fenchel duality, such as the Sandwich theorem, 
the subdifferential sum rule, and the Hahn-Banach extension theorem, among many others. 

Theorem 2.29 (Extended sandwich theorem) Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a bounded linear mapping. Suppose that $f : X \to \,]-\infty, +\infty]$, $g : Y \to \,]-\infty, +\infty]$ are proper convex functions which together with $T$ satisfy either (12) or (13). Assume that $f \ge -g \circ T$. Then there is an affine function $a : X \to \mathbb{R}$ of the form $a(x) = \langle T^* y^*, x \rangle + r$ satisfying $f \ge a \ge -g \circ T$. Moreover, for any $\bar{x}$ satisfying $f(\bar{x}) = (-g \circ T)(\bar{x})$, we have $-y^* \in \partial g(T \bar{x})$.

Proof. With notation as in the Fenchel duality Theorem 2.24, we know $d = p$, and since $p \ge 0$ because $f(x) \ge -g(Tx)$, the supremum in $d$ is attained. Therefore there exists $y^* \in Y^*$ such that

$0 \le p = d = -f^*(T^* y^*) - g^*(-y^*)$.

Then, by the Fenchel-Young inequality (4), we obtain

(15) $0 \le p \le f(x) - \langle T^* y^*, x \rangle + g(y) + \langle y^*, y \rangle$,

for any $x \in X$ and $y \in Y$. For any $z \in X$, setting $y = Tz$ in the previous inequality, we obtain

$a := \sup_{z \in X} \left[ -g(Tz) - \langle T^* y^*, z \rangle \right] \le b := \inf_{x \in X} \left[ f(x) - \langle T^* y^*, x \rangle \right]$.

Now choose $r \in [a, b]$. The affine function $a(x) := \langle T^* y^*, x \rangle + r$ satisfies $f \ge a \ge -g \circ T$, as claimed.

The last assertion follows from (15) simply by setting $x = \bar{x}$, where $\bar{x}$ satisfies $f(\bar{x}) = (-g \circ T)(\bar{x})$. Then we have $\sup_{y \in Y} \{ \langle -y^*, y \rangle - g(y) \} \le (-g \circ T)(\bar{x}) - \langle T^* y^*, \bar{x} \rangle$. Thus $g^*(-y^*) + g(T \bar{x}) \le -\langle y^*, T \bar{x} \rangle$ and hence $-y^* \in \partial g(T \bar{x})$. ■

When X = Y and T is the identity we recover the classical Sandwich theorem. The next example 
shows that without a constraint qualification, the sandwich theorem may fail. 

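Example 2.30 Consider $f, g : \mathbb{R} \to \,]-\infty, +\infty]$ given by

$f(x) := \begin{cases} -\sqrt{-x}, & \text{for } x \le 0, \\ +\infty & \text{otherwise,} \end{cases}$ and $g(x) := \begin{cases} -\sqrt{x}, & \text{for } x \ge 0, \\ +\infty & \text{otherwise.} \end{cases}$

In this case, $\bigcup_{\lambda > 0} \lambda \left[ \operatorname{dom} g - \operatorname{dom} f \right] = [0, +\infty[ \, \ne \mathbb{R}$ and it is not difficult to prove there is not any affine function which separates $f$ and $-g$, see Figure 2.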
The prior constraint qualifications are sufficient but not necessary for the sandwich theorem as we illustrate in the next example.

Example 2.31 Let $f, g : \mathbb{R} \to \,]-\infty, +\infty]$ be given by

$f(x) := \begin{cases} \frac{1}{x}, & \text{for } x > 0, \\ +\infty & \text{otherwise,} \end{cases}$ and $g(x) := \begin{cases} -\frac{1}{x}, & \text{for } x < 0, \\ +\infty & \text{otherwise.} \end{cases}$

Despite that $\bigcup_{\lambda > 0} \lambda \left[ \operatorname{dom} g - \operatorname{dom} f \right] = \,]-\infty, 0[ \, \ne \mathbb{R}$, the affine function $a(x) := -x$ satisfies $f \ge a \ge -g$, see Figure 2.



Figure 2: On the left we show the failure of the sandwich theorem in the absence of the constraint qualification; on the right we show that the constraint qualification is not necessary.
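Both examples can be probed numerically. In the sketch below (function names are illustrative), `separates_example_231` checks on sample points that $a(x) = -x$ lies between $f$ and $-g$ for $f(x) = 1/x$ on $x > 0$ and $g(x) = -1/x$ on $x < 0$, while `violates_example_230` exhibits, for a given slope $m$, a point where the line $a(x) = m x$ falls strictly below $-g(x) = \sqrt{x}$. Only lines through the origin need to be considered there, since $f(0) = 0 = -g(0)$ forces $a(0) = 0$.

```python
import math


def f_231(x):
    """f(x) = 1/x for x > 0, +infinity otherwise."""
    return 1.0 / x if x > 0 else math.inf


def g_231(x):
    """g(x) = -1/x for x < 0, +infinity otherwise."""
    return -1.0 / x if x < 0 else math.inf


def separates_example_231(points):
    """Check f(x) >= -x >= -g(x) at every sample point."""
    return all(f_231(x) >= -x >= -g_231(x) for x in points)


def violates_example_230(slope):
    """Find a point x > 0 where slope * x < sqrt(x).

    At such a point the line a(x) = slope * x fails a >= -g, where
    -g(x) = sqrt(x) for x >= 0 in the failing example.
    """
    x = 1.0 / (4.0 * slope * slope) if slope > 0 else 1.0
    return slope * x < math.sqrt(x)


if __name__ == "__main__":
    sample = [i / 10.0 for i in range(-50, 51)]
    print(separates_example_231(sample))
    print([violates_example_230(m) for m in (-3.0, 0.0, 0.5, 10.0, 1000.0)])
```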



Theorem 2.32 (Subdifferential sum rule) Let $X$ and $Y$ be Banach spaces, and let $f : X \to \,]-\infty, +\infty]$ and $g : Y \to \,]-\infty, +\infty]$ be convex functions and let $T : X \to Y$ be a bounded linear mapping. Then at any point $x \in X$ we have the sum rule

$\partial (f + g \circ T)(x) \supseteq \partial f(x) + T^*(\partial g(Tx))$

with equality if (12) or (13) hold.

Proof. The inclusion is straightforward by using the definition of the subdifferential, so we prove the reverse inclusion. Fix any $x \in X$ and let $x^* \in \partial (f + g \circ T)(x)$. Then $0 \in \partial (f - \langle x^*, \cdot \rangle + g \circ T)(x)$. Conditions for the equality in Theorem 2.24 are satisfied for the functions $f(\cdot) - \langle x^*, \cdot \rangle$ and $g$. Thus, there exists $y^* \in Y^*$ such that

$f(x) - \langle x^*, x \rangle + g(Tx) = -f^*(T^* y^* + x^*) - g^*(-y^*)$.

Now set $z^* := T^* y^* + x^*$. Hence, by the Fenchel-Young inequality (4), one has

$0 \le f(x) + f^*(z^*) - \langle z^*, x \rangle = -g(Tx) - g^*(-y^*) - \langle T^* y^*, x \rangle \le 0$;

whence,

$f(x) + f^*(z^*) = \langle z^*, x \rangle$,

$g(Tx) + g^*(-y^*) = \langle -y^*, Tx \rangle$.

Therefore equality in Fenchel-Young occurs, and one has $z^* \in \partial f(x)$ and $-y^* \in \partial g(Tx)$, which completes the proof. ■
The subdifferential sum rule for two convex functions with a finite common point where one of them is continuous was proved by Rockafellar in 1966 with an argument based on Fenchel duality, see [52, Th. 3]. In an earlier work in 1963, Moreau [43] proved the subdifferential sum rule for a pair of convex and lsc functions, in the case that the infimal convolution of the conjugate functions is achieved, see [46, p. 63] for more details. Moreau actually proved this result for functions which are the supremum of a family of affine continuous linear functions, a set which agrees with the convex and lsc functions when $X$ is a locally convex vector space, see [H] or [46, p. 28]. See also j33t [5^ HEl [T7] for more information about the subdifferential calculus rule.

Theorem 2.33 (Hahn-Banach extension) Let $X$ be a Banach space and let $f : X \to \mathbb{R}$ be a continuous sublinear function with $\operatorname{dom} f = X$. Suppose that $L$ is a linear subspace of $X$ and the function $h : L \to \mathbb{R}$ is linear and dominated by $f$, that is, $f \ge h$ on $L$. Then there exists $x^* \in X^*$, dominated by $f$, such that

$h(x) = \langle x^*, x \rangle$, for all $x \in L$.

Proof. Take $g := -h + \iota_L$ and apply Theorem 2.24 to $f$ and $g$ with $T$ the identity mapping. Then, there exists $x^* \in X^*$ such that

$0 \le \inf_{x \in X} \{ f(x) - h(x) + \iota_L(x) \}$

$= -f^*(x^*) - \sup_{x \in X} \{ \langle -x^*, x \rangle + h(x) - \iota_L(x) \}$

(16) $= -f^*(x^*) + \inf_{x \in L} \{ \langle x^*, x \rangle - h(x) \}$;

whence,

$f^*(x^*) \le \langle x^*, x \rangle - h(x)$, for all $x \in L$.

Observe that $f^*(x^*) \ge 0$ since $f(0) = 0$. Thus, $L$ being a linear subspace, we deduce from the above inequality that

$h(x) = \langle x^*, x \rangle$, for all $x \in L$.

Then (16) implies $f^*(x^*) = 0$, from where

$f(x) \ge \langle x^*, x \rangle$, for all $x \in X$,

and we are done. ■
Remark 2.34 Moreau's max formula (Theorem 2.5), a true child of Cauchy's principle of steepest descent, can be also derived from Fenchel duality. In fact, the non-emptiness of the subdifferential at a point of continuity, Moreau's max formula, Fenchel duality, the Sandwich theorem, the subdifferential sum rule, and the Hahn-Banach extension theorem are all equivalent, in the sense that they are easily inter-derivable.

In outline, one considers $h(u) := \inf_x f(x) + g(Ax + u)$ and checks that $\partial h(0) \ne \emptyset$ implies the Fenchel and Lagrangian duality results; while condition (12) or (13) implies $h$ is continuous at zero and thus Theorem 2.5 finishes the proof. Likewise, the polyhedral calculus [19, §5.1] implies $h$ is polyhedral when $f$ and $g$ are and shows that polyhedral functions have $\operatorname{dom} h = \operatorname{dom} \partial h$. This establishes Theorem 2.27. This also recovers abstract LP duality (e.g., semidefinite programming duality) under condition (12). See [12, 20] for more details.



Let us turn to two illustrations of the power of convex analysis within functional analysis.
A Banach limit is a bounded linear functional $\Lambda$ on the space $\ell^\infty$ of bounded sequences of real numbers such that

(i) $\Lambda((x_{n+1})_{n \in \mathbb{N}}) = \Lambda((x_n)_{n \in \mathbb{N}})$ (so it only depends on the sequence's tail),

(ii) $\liminf_k x_k \le \Lambda((x_k)) \le \limsup_k x_k$,

where $(x_n)_{n \in \mathbb{N}} = (x_1, x_2, \ldots) \in \ell^\infty$ and $(x_{n+1})_{n \in \mathbb{N}} = (x_2, x_3, \ldots)$. Thus $\Lambda$ agrees with the limit on $c$, the subspace of sequences whose limit exists. Banach limits are peculiar objects!

The Hahn-Banach extension theorem can be used to show the existence of Banach limits (see Sucheston [61] or [20, Exercise 5.4.12]). Many of its earliest applications were to summability theory and related fields. We sketch Sucheston's proof as follows.



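Theorem 2.35 (Banach limits) (See [61].) Banach limits exist.

Proof. Let $c$ be the subspace of convergent sequences in $\ell^\infty$. Define $f\colon \ell^\infty \to \mathbb{R}$ by

(17) $$x := (x_n)_{n\in\mathbb{N}} \mapsto \lim_{n\to\infty} \sup_{j} \frac{1}{n} \sum_{i=1}^{n} x_{i+j}.$$

Then $f$ is sublinear with full domain, since the limit in (17) always exists (see [61, p. 309]). Define
$h$ on $c$ by $h(x) := \lim_n x_n$ for every $x := (x_n)_{n\in\mathbb{N}}$ in $c$. Hence $h$ is linear and agrees with $f$ on $c$.
Applying the Hahn-Banach extension Theorem 2.33, there exists $\Lambda \in (\ell^\infty)^*$, dominated by $f$, such
that $\Lambda = h$ on $c$. Thus $\Lambda$ extends the limit linearly from $c$ to $\ell^\infty$. Let $S$ denote the forward shift
defined as $S((x_n)_{n\in\mathbb{N}}) := (x_{n+1})_{n\in\mathbb{N}}$. Note that $f(Sx - x) = 0$, since

$$|f(Sx - x)| = \Big|\lim_{n\to\infty} \sup_j \frac{1}{n}\,(x_{j+n+1} - x_{j+1})\Big| \le \lim_{n\to\infty} \frac{2}{n} \sup_j |x_j| = 0.$$

Thus, $\Lambda(Sx) - \Lambda(x) = \Lambda(Sx - x) \le 0$, and $\Lambda(x) - \Lambda(Sx) = \Lambda(x - Sx) \le f(x - Sx) = 0$; that is, $\Lambda$
is indeed a Banach limit. ■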
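
As a numerical illustration of the sublinear functional used in the proof, the following minimal sketch evaluates a finite-window proxy for $\sup_j \frac{1}{n}\sum_{i=1}^n x_{i+j}$ on a truncated sequence. The function name `sucheston_f` and the truncation are assumptions made for illustration only: a finite list cannot capture the limit in (17), so this merely shows the telescoping bound $2/n$ for $Sx - x$ at work.

```python
import random

def sucheston_f(x, n):
    """Finite-window proxy for sup_j (1/n) * sum_{i=1}^{n} x_{i+j}.

    The supremum runs only over windows that fit inside the finite list x,
    so this approximates the inner supremum in (17), not the limit itself.
    """
    window = sum(x[:n])           # sum of the first n terms
    best = window
    for j in range(len(x) - n):   # slide the window by one position
        window += x[j + n] - x[j]
        best = max(best, window)
    return best / n

if __name__ == "__main__":
    random.seed(0)
    x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
    shifted_minus_x = [x[k + 1] - x[k] for k in range(len(x) - 1)]  # Sx - x
    for n in (10, 100, 1000):
        # Each window sum telescopes to x[j+n] - x[j], so the value is at most 2/n.
        print(n, sucheston_f(shifted_minus_x, n), 2.0 / n)
```

For the alternating sequence $((-1)^k)$ every window of even length sums to zero, which is consistent with any Banach limit assigning it the value $0$.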
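Theorem 2.36 (Principle of uniform boundedness) (See [20, Example 1.4.8].) Let $Y$ be
another Banach space and $T_\alpha\colon X \to Y$ for $\alpha \in \mathcal{A}$ be bounded linear operators. Assume that
$\sup_{\alpha\in\mathcal{A}} \|T_\alpha(x)\| < +\infty$ for each $x$ in $X$. Then $\sup_{\alpha\in\mathcal{A}} \|T_\alpha\| < +\infty$.

Proof. Define a function $f_{\mathcal{A}}$ by

$$f_{\mathcal{A}}(x) := \sup_{\alpha\in\mathcal{A}} \|T_\alpha(x)\|$$

for each $x$ in $X$. Then, as observed in Fact 2.1(i), $f_{\mathcal{A}}$ is convex. It is also lower semicontinuous since
each mapping $x \mapsto \|T_\alpha(x)\|$ is continuous. Hence $f_{\mathcal{A}}$ is a finite, lower semicontinuous and convex
(actually sublinear) function. Now Fact 2.1(iv) ensures $f_{\mathcal{A}}$ is continuous at the origin. Select $\varepsilon > 0$
with $\sup\{f_{\mathcal{A}}(x) \mid \|x\| \le \varepsilon\} \le 1 + f_{\mathcal{A}}(0) = 1$. It follows that

$$\sup_{\alpha\in\mathcal{A}} \|T_\alpha\| = \sup_{\alpha\in\mathcal{A}} \sup_{\|x\|\le 1} \|T_\alpha(x)\| = \sup_{\alpha\in\mathcal{A}} \frac{1}{\varepsilon} \sup_{\|x\|\le\varepsilon} \|T_\alpha(x)\| = \frac{1}{\varepsilon} \sup_{\|x\|\le\varepsilon} \sup_{\alpha\in\mathcal{A}} \|T_\alpha(x)\| \le \frac{1}{\varepsilon}.$$

Thus, uniform boundedness is revealed to be continuity of $f_{\mathcal{A}}$. ■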



3 The Chebyshev problem 

Let $C$ be a nonempty subset of $X$. We define the nearest point mapping by

$$P_C(x) := \{v \in C \mid \|v - x\| = d_C(x)\}.$$

A set $C$ is said to be a Chebyshev set if $P_C(x)$ is a singleton for every $x \in X$. If $P_C(x) \neq \varnothing$ for
every $x \in X$, then $C$ is said to be proximal; the term proximinal is also used.

In 1961 Victor Klee [36] posed the following fundamental question: Is every Chebyshev set in a
Hilbert space convex? At this stage, it is known that the answer is affirmative for weakly closed
sets. In what follows we will present a proof of this fact via convex duality. To this end, we will
make use of the following fairly simple lemma.

Lemma 3.1 (See [20, Proposition 4.5.8].) Let $C$ be a weakly closed Chebyshev subset of a Hilbert
space $H$. Then the nearest point mapping $P_C$ is continuous.

Theorem 3.2 Let $C$ be a nonempty weakly closed subset of a Hilbert space $H$. Then $C$ is convex
if and only if $C$ is a Chebyshev set.

Proof. For the direct implication, we will begin by proving that $C$ is proximal. We can and do
suppose that $0 \in C$. Pick any $x \in H$. Consider the convex and lsc functions $f(z) := -\langle x, z\rangle + \iota_{B_H}(z)$
and $g(z) := \sigma_C(z)$. Notice that $\bigcup_{\lambda>0} \lambda\,[\operatorname{dom} g - \operatorname{dom} f] = H$. With the notation of Theorem 2.24
one has $p = d$, and the supremum of the dual problem is attained if finite. Since $f^*(y) = \|x + y\|$
and $g^*(y) = \iota_C(y)$, as $C$ is closed, the dual problem (11) takes the form

$$d = \sup_{y\in H} \{-\|x + y\| - \iota_C(-y)\} = -d_C(x).$$

Choose any $c \in C$. Observe that $0 \le d_C(x) \le \|x - c\|$. Therefore the supremum must be attained,
and $P_C(x) \neq \varnothing$. Uniqueness follows easily from the convexity of $C$.

For the converse, consider the function $f := \tfrac12\|\cdot\|^2 + \iota_C$. We first show that

(18) $$\partial f^*(x) = \{P_C(x)\}, \quad \text{for all } x \in H.$$

Indeed, for $x \in H$,

$$\begin{aligned}
f^*(x) &= \sup_{y\in C} \big\{\langle x, y\rangle - \tfrac12 \langle y, y\rangle\big\} \\
&= \tfrac12 \langle x, x\rangle + \tfrac12 \sup_{y\in C} \big\{-\langle x, x\rangle + 2\langle x, y\rangle - \langle y, y\rangle\big\} \\
&= \tfrac12 \|x\|^2 - \tfrac12 \inf_{y\in C} \|x - y\|^2 \\
&= \tfrac12 \|x\|^2 - \tfrac12 \|x - P_C(x)\|^2 \\
&= \langle x, P_C(x)\rangle - \tfrac12 \|P_C(x)\|^2 \\
&= \langle x, P_C(x)\rangle - f(P_C(x)).
\end{aligned}$$

Consequently, by Proposition 2.7, $P_C(x) \in \partial f^*(x)$ for $x \in H$. Now suppose $y \in \partial f^*(x)$, and define
$x_n = x + \tfrac{1}{n}(y - P_C(x))$. Then $x_n \to x$, and hence $P_C(x_n) \to P_C(x)$ by Lemma 3.1. Using the
subdifferential inequality, we have

$$0 \le \langle x_n - x, P_C(x_n) - y\rangle = \tfrac{1}{n} \langle y - P_C(x), P_C(x_n) - y\rangle.$$

This now implies:

$$0 \le \lim_{n\to\infty} \langle y - P_C(x), P_C(x_n) - y\rangle = -\|y - P_C(x)\|^2.$$

Consequently, $y = P_C(x)$ and so (18) is established.

Since $f^*$ is continuous and we just proved that $\partial f^*$ is a singleton, Proposition 2.3 implies that $f^*$
is Gateaux differentiable. Now $-\infty < f^{**}(x) \le f(x) = \tfrac12\|x\|^2 < +\infty$ for all $x \in C$. Thus, $f^{**}$ is a proper
function. One can easily check that $f$ is sequentially weakly lsc, $C$ being weakly closed. Therefore,
Theorem 2.15 implies that $f$ is convex; whence, $\operatorname{dom} f = C$ must be convex. ■
Observe that we have actually proved that every Chebyshev set with a continuous projection 
mapping is convex (and closed). We finish the section by recalling a simple but powerful "hidden 
convexity" result. 

Remark 3.3 (See [3].) Let $C$ be a closed subset of a Hilbert space $H$. Then there exists a
continuous and convex function $f$ defined on $H$ such that $d_C^2(x) = \|x\|^2 - f(x)$, $\forall x \in H$. Precisely,
$f$ can be taken as $x \mapsto \sup_{c\in C} \{2\langle x, c\rangle - \|c\|^2\}$.
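
The hidden convexity identity $d_C^2 = \|\cdot\|^2 - f$ with $f(x) = \sup_{c\in C}\{2\langle x,c\rangle - \|c\|^2\}$ can be spot-checked numerically. The sketch below assumes a finite (hence closed, and typically nonconvex) set $C \subset \mathbb{R}^2$; the names `dist_sq` and `hidden_convex_f` are hypothetical helpers, not notation from the text.

```python
import random

def dist_sq(x, C):
    """Squared distance from x to the finite set C."""
    return min(sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in C)

def hidden_convex_f(x, C):
    """f(x) = max over c in C of 2<x,c> - ||c||^2, a maximum of affine functions."""
    return max(
        2.0 * sum(xi * ci for xi, ci in zip(x, c)) - sum(ci * ci for ci in c)
        for c in C
    )

if __name__ == "__main__":
    random.seed(0)
    C = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(20)]
    x = (0.3, -1.1)
    norm_sq = sum(t * t for t in x)
    # The two printed numbers should agree up to rounding.
    print(dist_sq(x, C), norm_sq - hidden_convex_f(x, C))
```

Because `hidden_convex_f` is a pointwise maximum of affine functions, its convexity holds regardless of the shape of $C$.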



4 Monotone operator theory 

Let $A\colon X \rightrightarrows X^*$ be a set-valued operator (also known as a relation, point-to-set mapping or
multifunction), i.e., for every $x \in X$, $Ax \subseteq X^*$, and let $\operatorname{gra} A := \{(x, x^*) \in X \times X^* \mid x^* \in Ax\}$ be
the graph of $A$. The domain of $A$ is $\operatorname{dom} A := \{x \in X \mid Ax \neq \varnothing\}$ and $\operatorname{ran} A := A(X)$ is the range
of $A$. We say that $A$ is monotone if

(19) $$\langle x - y, x^* - y^*\rangle \ge 0, \quad \text{for all } (x, x^*), (y, y^*) \in \operatorname{gra} A,$$

and maximally monotone if $A$ is monotone and $A$ has no proper monotone extension (in the sense
of graph inclusion). Given $A$ monotone, we say that $(x, x^*) \in X \times X^*$ is monotonically related to
$\operatorname{gra} A$ if

$$\langle x - y, x^* - y^*\rangle \ge 0, \quad \text{for all } (y, y^*) \in \operatorname{gra} A.$$

Monotone operators have frequently shown themselves to be a key class of objects in both modern 
Optimization and Analysis; see, e.g., dH H Ell [22] , the books ia[20l[26l[50l[57l[58l[55l[Ml[Mll6l! 

and the references given therein. 

Given sets $S \subseteq X$ and $D \subseteq X^*$, we define $S^\perp$ by $S^\perp := \{x^* \in X^* \mid \langle x^*, x\rangle = 0, \ \forall x \in S\}$
and $D_\perp$ by $D_\perp := \{x \in X \mid \langle x, x^*\rangle = 0, \ \forall x^* \in D\}$ [5T]. Then the adjoint of $A$ is the operator
$A^*\colon X^{**} \rightrightarrows X^*$ such that

$$\operatorname{gra} A^* := \{(x^{**}, x^*) \in X^{**} \times X^* \mid (x^*, -x^{**}) \in (\operatorname{gra} A)^\perp\}.$$

Note that the adjoint is always a linear relation, i.e. its graph is a linear subspace.

The Fitzpatrick function [31] associated with an operator $A$ is the function $F_A\colon X \times X^* \to
\left]-\infty, +\infty\right]$ defined by

(20) $$F_A(x, x^*) := \sup_{(a, a^*) \in \operatorname{gra} A} \big(\langle x, a^*\rangle + \langle a, x^*\rangle - \langle a, a^*\rangle\big).$$

Fitzpatrick functions have proved to be an important tool in modern monotone operator
theory. One of the main reasons is shown in the following result.

Fact 4.1 (Fitzpatrick) (See [31, Propositions 3.2 & 4.2, Theorem 3.4 and Corollary 3.9].) Let
$A\colon X \rightrightarrows X^*$ be monotone with $\operatorname{dom} A \neq \varnothing$. Then $F_A$ is proper, lower semicontinuous, convex,
and $F_A = \langle\cdot,\cdot\rangle$ on $\operatorname{gra} A$. Moreover, if $A$ is maximally monotone, for every $(x, x^*) \in X \times X^*$, the
inequality

$$\langle x, x^*\rangle \le F_A(x, x^*) \le F_A^*(x^*, x)$$

is true, and the first equality holds if and only if $(x, x^*) \in \operatorname{gra} A$.
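
To make Fact 4.1 concrete, here is a minimal sketch for a single-valued linear monotone operator on $\mathbb{R}^2$. It assumes $A$ is a matrix whose symmetric part $P = (A + A^{\mathsf T})/2$ is positive definite; in that case the supremum in (20) is a concave quadratic maximization with closed form $F_A(x, x^*) = \tfrac14\, w^{\mathsf T} P^{-1} w$, where $w = x^* + A^{\mathsf T}x$. This closed form and all helper names are assumptions of the illustration, not part of the cited results.

```python
def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def inverse(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def symmetric_part(A):
    At = transpose(A)
    return [[(A[i][j] + At[i][j]) / 2.0 for j in range(2)] for i in range(2)]

def fitzpatrick_objective(A, a, x, xs):
    """<x, Aa> + <a, x*> - <a, Aa>: the quantity maximized over a in (20)."""
    Aa = matvec(A, a)
    return dot(x, Aa) + dot(a, xs) - dot(a, Aa)

def fitzpatrick_maximizer(A, x, xs):
    """Stationary point a = (1/2) P^{-1} (x* + A^T x) of the concave objective."""
    Atx = matvec(transpose(A), x)
    w = [xs[0] + Atx[0], xs[1] + Atx[1]]
    half = matvec(inverse(symmetric_part(A)), w)
    return [0.5 * half[0], 0.5 * half[1]]

def fitzpatrick_linear(A, x, xs):
    """Closed form (1/4) w^T P^{-1} w with w = x* + A^T x."""
    Atx = matvec(transpose(A), x)
    w = [xs[0] + Atx[0], xs[1] + Atx[1]]
    return 0.25 * dot(w, matvec(inverse(symmetric_part(A)), w))

# Example operator: symmetric part diag(2, 1) is positive definite, so A is monotone.
A = [[2.0, 1.0], [-1.0, 1.0]]

if __name__ == "__main__":
    x = [1.0, -2.0]
    on_graph = matvec(A, x)
    print(fitzpatrick_linear(A, x, on_graph), dot(x, on_graph))    # equal on the graph
    print(fitzpatrick_linear(A, x, [0.0, 0.0]), dot(x, [0.0, 0.0]))  # strictly larger off it
```

In this setting the gap $F_A(x,x^*) - \langle x, x^*\rangle$ equals $\tfrac14 (x^* - Ax)^{\mathsf T} P^{-1} (x^* - Ax)$, which vanishes exactly on the graph.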

The next result is central to maximal monotone operator theory and algorithmic analysis. Orig- 
inally it was proved by direct and harder methods than the concise convex analysis argument we 
present. 

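Theorem 4.2 (Local boundedness) (See [50, Theorem 2.2.8].) Let $A\colon X \rightrightarrows X^*$ be monotone
with $\operatorname{int}\operatorname{dom} A \neq \varnothing$. Then $A$ is locally bounded at $x \in \operatorname{int}\operatorname{dom} A$, i.e., there exist $\delta > 0$ and $K > 0$
such that

$$\sup_{y^* \in Ay} \|y^*\| \le K, \quad \forall y \in (x + \delta B_X) \cap \operatorname{dom} A.$$

Proof. Let $x \in \operatorname{int}\operatorname{dom} A$. After translating the graphs if necessary, we can and do suppose that
$x = 0$ and $(0, 0) \in \operatorname{gra} A$. Define $f\colon X \to \left]-\infty, +\infty\right]$ by

$$y \mapsto \sup_{(a, a^*) \in \operatorname{gra} A,\ \|a\| \le 1} \langle y - a, a^*\rangle.$$

By Fact 2.1(i), $f$ is convex and lower semicontinuous. Since $0 \in \operatorname{int}\operatorname{dom} A$, there exists $\delta_1 > 0$
such that $\delta_1 B_X \subseteq \operatorname{dom} A$. Now we show that $\delta_1 B_X \subseteq \operatorname{dom} f$. Let $y \in \delta_1 B_X$ and $y^* \in Ay$. Thence,
we have

$$\begin{aligned}
& \langle y - a, y^* - a^*\rangle \ge 0, \quad \forall (a, a^*) \in \operatorname{gra} A,\ \|a\| \le 1 \\
\Rightarrow\ & \langle y - a, y^*\rangle \ge \langle y - a, a^*\rangle, \quad \forall (a, a^*) \in \operatorname{gra} A,\ \|a\| \le 1 \\
\Rightarrow\ & +\infty > (\|y\| + 1)\cdot\|y^*\| \ge \langle y - a, a^*\rangle, \quad \forall (a, a^*) \in \operatorname{gra} A,\ \|a\| \le 1 \\
\Rightarrow\ & f(y) < +\infty \ \Rightarrow\ y \in \operatorname{dom} f.
\end{aligned}$$

Hence $\delta_1 B_X \subseteq \operatorname{dom} f$ and thus $0 \in \operatorname{int}\operatorname{dom} f$. By Fact 2.1(iv), there is $\delta > 0$ with $\delta \le \min\{\tfrac12, \tfrac12\delta_1\}$
such that

$$f(y) \le f(0) + 1 = 1, \quad \forall y \in 2\delta B_X.$$

Thus,

$$\langle y, a^*\rangle \le \langle a, a^*\rangle + 1, \quad \forall y \in 2\delta B_X,\ (a, a^*) \in \operatorname{gra} A,\ \|a\| \le \delta,$$

whence, taking the supremum with $y \in 2\delta B_X$,

$$\begin{aligned}
& 2\delta\|a^*\| \le \|a\|\cdot\|a^*\| + 1 \le \delta\|a^*\| + 1, \quad \forall (a, a^*) \in \operatorname{gra} A,\ a \in \delta B_X \\
\Rightarrow\ & \|a^*\| \le \tfrac{1}{\delta}, \quad \forall (a, a^*) \in \operatorname{gra} A,\ a \in \delta B_X.
\end{aligned}$$

Setting $K := \tfrac{1}{\delta}$, we get the desired result. ■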



Generalizations of Theorem 4.2 can be found in |58[ I16j and |2H Lemma 4.1]. 



4.1 Sum theorem and Minty surjectivity theorem 

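In the early 1960s, Minty [40] presented an important characterization of maximally monotone
operators in a Hilbert space, which we now reestablish. The proof we give of Theorem 4.3 is due
to Simons and Zălinescu [59, Theorem 1.2]. We denote by Id the identity mapping from $H$ to $H$.

Theorem 4.3 (Minty) Suppose that $H$ is a Hilbert space. Let $A\colon H \rightrightarrows H$ be monotone. Then
$A$ is maximally monotone if and only if $\operatorname{ran}(A + \operatorname{Id}) = H$.

Proof. "$\Rightarrow$": Fix any $x_0^* \in H$, and let $B\colon H \rightrightarrows H$ be given by $\operatorname{gra} B := \operatorname{gra} A - \{(0, x_0^*)\}$. Then $B$
is maximally monotone. Define $F\colon H \times H \to \left]-\infty, +\infty\right]$ by

(21) $$(x, x^*) \mapsto F_B(x, x^*) + \tfrac12\|x\|^2 + \tfrac12\|x^*\|^2.$$

Fact 4.1 together with Fact 2.1(v) implies that $F$ is coercive. By [62, Theorem 2.5.1(ii)], $F$ has a
minimizer. Assume that $(z, z^*) \in H \times H$ is a minimizer of $F$. Then we have $(0, 0) \in \partial F(z, z^*)$.
Thus, $(0, 0) \in \partial F_B(z, z^*) + (z, z^*)$ and $(-z, -z^*) \in \partial F_B(z, z^*)$. Then

$$\langle (-z, -z^*), (b, b^*) - (z, z^*)\rangle \le F_B(b, b^*) - F_B(z, z^*), \quad \forall (b, b^*) \in \operatorname{gra} B,$$

and by Fact 4.1,

$$\langle (-z, -z^*), (b, b^*) - (z, z^*)\rangle \le \langle b, b^*\rangle - \langle z, z^*\rangle, \quad \forall (b, b^*) \in \operatorname{gra} B;$$

that is,

(22) $$0 \le \langle b, b^*\rangle - \langle z, z^*\rangle + \langle z, b\rangle + \langle z^*, b^*\rangle - \|z\|^2 - \|z^*\|^2, \quad \forall (b, b^*) \in \operatorname{gra} B.$$

Hence,

$$\langle b + z^*, b^* + z\rangle = \langle b, b^*\rangle + \langle z, b\rangle + \langle z^*, b^*\rangle + \langle z, z^*\rangle \ge \|z + z^*\|^2 \ge 0, \quad \forall (b, b^*) \in \operatorname{gra} B,$$

which implies that $(-z^*, -z) \in \operatorname{gra} B$, since $B$ is maximally monotone. This combined with (22)
implies $0 \le -2\langle z, z^*\rangle - \|z\|^2 - \|z^*\|^2$. Then we have $z = -z^*$, and $(z, -z) = (-z^*, -z) \in \operatorname{gra} B$,
whence $(z, -z) + (0, x_0^*) \in \operatorname{gra} A$. Therefore $x_0^* \in Az + z$, which implies $x_0^* \in \operatorname{ran}(A + \operatorname{Id})$.

"$\Leftarrow$": Let $(v, v^*) \in H \times H$ be monotonically related to $\operatorname{gra} A$. Since $\operatorname{ran}(A + \operatorname{Id}) = H$, there
exists $(y, y^*) \in \operatorname{gra} A$ such that $v^* + v = y^* + y$. Then we have

$$-\|v - y\|^2 = \langle v - y, y^* + y - v - y^*\rangle = \langle v - y, v^* - y^*\rangle \ge 0.$$

Hence $v = y$, which also implies $v^* = y^*$. Thus $(v, v^*) \in \operatorname{gra} A$, and therefore $A$ is maximally
monotone. ■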



Remark 4.4 The proof of Minty's theorem in reflexive spaces (in which case it asserts the
surjectivity of $A + J_X$ for the normalized duality mapping $J_X$ defined below) [20, Proposition 3.5.6,
page 119] is only slightly more complicated than that of Theorem 4.3.

Let $A$ and $B$ be maximally monotone operators from $X$ to $X^*$. Clearly, the sum operator
$A + B\colon X \rightrightarrows X^*\colon x \mapsto Ax + Bx := \{a^* + b^* \mid a^* \in Ax \text{ and } b^* \in Bx\}$ is monotone. Rockafellar
established the following important result in 1970 [5l], the so-called "sum theorem": Suppose that
$X$ is reflexive. If $\operatorname{dom} A \cap \operatorname{int}\operatorname{dom} B \neq \varnothing$, then $A + B$ is maximally monotone. We can weaken this
constraint qualification to be that $\bigcup_{\lambda>0} \lambda\,[\operatorname{dom} A - \operatorname{dom} B]$ is a closed subspace (see [58, 60, 20]).

We turn to a new proof of this generalized result. To this end, we need the following fact
along with the definition of the partial inf-convolution. Given two real Banach spaces $X$, $Y$ and
$F_1, F_2\colon X \times Y \to \left]-\infty, +\infty\right]$, the partial inf-convolution $F_1 \Box_2 F_2$ is the function defined on $X \times Y$
by

$$F_1 \Box_2 F_2\colon (x, y) \mapsto \inf_{v \in Y} \{F_1(x, y - v) + F_2(x, v)\}.$$

Fact 4.5 (Simons and Zălinescu) (See [60, Theorem 4.2] or [58, Theorem 16.4(a)].) Let $X$, $Y$
be real Banach spaces and $F_1, F_2\colon X \times Y \to \left]-\infty, +\infty\right]$ be proper lower semicontinuous and convex
bifunctionals. Assume that for every $(x, y) \in X \times Y$,

$$(F_1 \Box_2 F_2)(x, y) > -\infty$$

and that $\bigcup_{\lambda>0} \lambda\,[P_X \operatorname{dom} F_1 - P_X \operatorname{dom} F_2]$ is a closed subspace of $X$. Then for every $(x^*, y^*) \in
X^* \times Y^*$,

$$(F_1 \Box_2 F_2)^*(x^*, y^*) = \min_{u^* \in X^*} \{F_1^*(x^* - u^*, y^*) + F_2^*(u^*, y^*)\}.$$

We denote by $J_X$ the duality map from $X$ to $X^*$, which will be simply written as $J$, i.e., the
subdifferential of the function $\tfrac12\|\cdot\|^2$. Let $F\colon X \times Y \to \left]-\infty, +\infty\right]$ be a bifunctional defined on
two real Banach spaces. Following the notation by Penot [48] we set

(23) $$F^{\intercal}\colon Y \times X\colon (y, x) \mapsto F(x, y).$$

Theorem 4.6 (Sum theorem) Suppose that $X$ is reflexive. Let $A, B\colon X \rightrightarrows X^*$ be maximally
monotone. Assume that $\bigcup_{\lambda>0} \lambda\,[\operatorname{dom} A - \operatorname{dom} B]$ is a closed subspace. Then $A + B$ is maximally
monotone.

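Proof. Clearly, $A + B$ is monotone. Assume that $(z, z^*) \in X \times X^*$ is monotonically related to
$\operatorname{gra}(A + B)$.

Let $F_1 := F_A \Box_2 F_B$, and $F_2 := F_1^{*\intercal}$. By [7, Lemma 5.8], $\bigcup_{\lambda>0} \lambda\,[P_X(\operatorname{dom} F_A) - P_X(\operatorname{dom} F_B)]$ is
a closed subspace. Then Fact 4.5 implies that

(24) $$F_1^*(x^*, x) = \min_{u^* \in X^*} \{F_A^*(x^* - u^*, x) + F_B^*(u^*, x)\}, \quad \text{for all } (x, x^*) \in X \times X^*.$$

Set $G\colon X \times X^* \to \left]-\infty, +\infty\right]$ by

$$(x, x^*) \mapsto F_2(x + z, x^* + z^*) - \langle x, z^*\rangle - \langle z, x^*\rangle + \tfrac12\|x\|^2 + \tfrac12\|x^*\|^2.$$

Assume that $(x_0, x_0^*) \in X \times X^*$ is a minimizer of $G$. ([62, Theorem 2.5.1(ii)] implies that minimizers
exist since $G$ is coercive.) Then we have $(0, 0) \in \partial G(x_0, x_0^*)$. Thus, there exist $v^* \in Jx_0$, $v \in J_{X^*}x_0^*$
such that $(0, 0) \in \partial F_2(x_0 + z, x_0^* + z^*) + (v^*, v) + (-z^*, -z)$, and then

$$(z^* - v^*, z - v) \in \partial F_2(x_0 + z, x_0^* + z^*).$$

Thence

(25) $$\langle (z^* - v^*, z - v), (x_0 + z, x_0^* + z^*)\rangle = F_2(x_0 + z, x_0^* + z^*) + F_2^*(z^* - v^*, z - v).$$

Fact 4.1 and (24) show that

$$F_2 \ge \langle\cdot,\cdot\rangle, \qquad F_2^{*\intercal} = F_1 \ge \langle\cdot,\cdot\rangle.$$

Then by (25),

$$\langle (z^* - v^*, z - v), (x_0 + z, x_0^* + z^*)\rangle = F_2(x_0 + z, x_0^* + z^*) + F_2^*(z^* - v^*, z - v)$$
(26) $$\ge \langle x_0 + z, x_0^* + z^*\rangle + \langle z^* - v^*, z - v\rangle.$$

Thus, since $v^* \in Jx_0$, $v \in J_{X^*}x_0^*$,

$$\begin{aligned}
0 \le \delta &:= \langle (z^* - v^*, z - v), (x_0 + z, x_0^* + z^*)\rangle - \langle x_0 + z, x_0^* + z^*\rangle - \langle z^* - v^*, z - v\rangle \\
&= \langle -x_0 - v, x_0^* + v^*\rangle = \langle -x_0, x_0^*\rangle - \langle x_0, v^*\rangle - \langle v, x_0^*\rangle - \langle v, v^*\rangle \\
&= \langle -x_0, x_0^*\rangle - \tfrac12\|x_0^*\|^2 - \tfrac12\|x_0\|^2 - \tfrac12\|v^*\|^2 - \tfrac12\|v\|^2 - \langle v, v^*\rangle,
\end{aligned}$$

which implies

$$\delta = 0 \quad \text{and} \quad \langle x_0, x_0^*\rangle + \tfrac12\|x_0^*\|^2 + \tfrac12\|x_0\|^2 = 0;$$

that is,

(27) $$\delta = 0 \quad \text{and} \quad x_0^* \in -Jx_0.$$

Combining (26) and (27), we have $F_2(x_0 + z, x_0^* + z^*) = \langle x_0 + z, x_0^* + z^*\rangle$. By (24) and Fact 4.1,

(28) $$(x_0 + z, x_0^* + z^*) \in \operatorname{gra}(A + B).$$

Since $(z, z^*)$ is monotonically related to $\operatorname{gra}(A + B)$, it follows from (28) that

$$\langle x_0, x_0^*\rangle = \langle x_0 + z - z, x_0^* + z^* - z^*\rangle \ge 0,$$

and then by (27)

$$-\|x_0\|^2 = -\|x_0^*\|^2 \ge 0,$$

whence $(x_0, x_0^*) = (0, 0)$. Finally, by (28), one deduces that $(z, z^*) \in \operatorname{gra}(A + B)$ and $A + B$ is
maximally monotone. ■

It is still unknown whether the reflexivity condition can be omitted in Theorem 4.6, though many
partial results exist; see [12, 13] and [20, §9.7].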

4.2 Autoconjugate functions 

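Given $F\colon X \times X^* \to \left]-\infty, +\infty\right]$, we say that $F$ is autoconjugate if $F = F^{*\intercal}$ on $X \times X^*$. We say
$F$ is a representer for $\operatorname{gra} A$ if

(29) $$\operatorname{gra} A = \{(x, x^*) \in X \times X^* \mid F(x, x^*) = \langle x, x^*\rangle\}.$$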



Autoconjugate functions are the core of representer theory, which has been comprehensively studied 
in Optimization (see 13 [H |M1 [20] ) • 

Fitzpatrick posed the following question in [31, Problem 5.5]:

If $A\colon X \rightrightarrows X^*$ is maximally monotone, does there necessarily exist an autoconjugate
representer for $A$?

Bauschke and Wang gave an affirmative answer to the above question in reflexive spaces by
construction of the function $\mathcal{B}_A$ in Fact 4.7. This naturally raises a question:

Is $\mathcal{B}_A$ still an autoconjugate representer for a maximally monotone operator $A$ in a
general Banach space?

We give a negative answer to the above question in Example 4.12: in certain spaces, $\mathcal{B}_A$ fails to be
autoconjugate.
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Fact 4.7 (Bauschke and Wang) (See [H Theorem 5.7].) Suppose that X is reflexive. Let 
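$A\colon X \rightrightarrows X^*$ be maximally monotone. Then

(30) $$\mathcal{B}_A\colon X \times X^* \to \left]-\infty, +\infty\right]\colon (x, x^*) \mapsto \inf_{(y, y^*) \in X \times X^*} \big\{\tfrac12 F_A(x + y, x^* + y^*) + \tfrac12 F_A^{*\intercal}(x - y, x^* - y^*) + \tfrac12\|y\|^2 + \tfrac12\|y^*\|^2\big\}$$

is an autoconjugate representer for $A$.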



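We will make use of the following result to prove Theorem 4.11.

Fact 4.8 (Simons) (See [58, Corollary 10.4].) Let $f_1, f_2, g\colon X \to \left]-\infty, +\infty\right]$ be proper convex.
Assume that $g$ is continuous at a point of $\operatorname{dom} f_1 - \operatorname{dom} f_2$. Suppose that

$$h(x) := \inf_{z \in X} \big\{\tfrac12 f_1(x + z) + \tfrac12 f_2(x - z) + \tfrac14 g(2z)\big\} > -\infty, \quad \forall x \in X.$$

Then

$$h^*(x^*) = \min_{z^* \in X^*} \big\{\tfrac12 f_1^*(x^* + z^*) + \tfrac12 f_2^*(x^* - z^*) + \tfrac14 g^*(-2z^*)\big\}, \quad \forall x^* \in X^*.$$

Let $A\colon X \rightrightarrows X^*$ be a linear relation. We say that $A$ is skew if $\operatorname{gra} A \subseteq \operatorname{gra}(-A^*)$; equivalently,
if $\langle x, x^*\rangle = 0$, $\forall (x, x^*) \in \operatorname{gra} A$. Furthermore, $A$ is symmetric if $\operatorname{gra} A \subseteq \operatorname{gra} A^*$; equivalently, if
$\langle x, y^*\rangle = \langle y, x^*\rangle$, $\forall (x, x^*), (y, y^*) \in \operatorname{gra} A$. We define the symmetric part and the skew part of $A$ via

(31) $$P := \tfrac12 A + \tfrac12 A^* \quad \text{and} \quad S := \tfrac12 A - \tfrac12 A^*,$$

respectively. It is easy to check that $P$ is symmetric and that $S$ is skew.

Fact 4.9 (See [4, Theorem 3.7].) Let $A\colon X^* \to X^{**}$ be linear and continuous. Assume that
$\operatorname{ran} A \subseteq X$ and that there exists $e \in X^{**} \setminus X$ such that

$$\langle Ax^*, x^*\rangle = \langle e, x^*\rangle^2, \quad \forall x^* \in X^*.$$

Let $P$ and $S$ respectively be the symmetric part and skew part of $A$. Let $T\colon X \rightrightarrows X^*$ be defined by

(32) $$\operatorname{gra} T := \{(-Sx^*, x^*) \mid x^* \in X^*, \langle e, x^*\rangle = 0\} = \{(-Ax^*, x^*) \mid x^* \in X^*, \langle e, x^*\rangle = 0\}.$$

Then the following hold.

(i) $A$ is a maximally monotone operator on $X^*$.

(ii) $Px^* = \langle x^*, e\rangle e$, $\forall x^* \in X^*$.

(iii) $T$ is maximally monotone and skew on $X$.

(iv) $\operatorname{gra} T^* = \{(Sx^* + re, x^*) \mid x^* \in X^*, \ r \in \mathbb{R}\}$.

(v) $F_T = \iota_C$, where $C := \{(-Ax^*, x^*) \mid x^* \in X^*\}$.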
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We next give concrete examples of A, T as in Fact 4.9 



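Example 4.10 ($c_0$) (See IH Example 4.1].) Let $X := c_0$, with norm $\|\cdot\|_\infty$ so that $X^* = \ell^1$
with norm $\|\cdot\|_1$, and $X^{**} = \ell^\infty$ with its second dual norm $\|\cdot\|_*$. Fix $\alpha := (\alpha_n)_{n\in\mathbb{N}} \in \ell^\infty$ with
$\limsup \alpha_n \neq 0$, and let $A_\alpha\colon \ell^1 \to \ell^\infty$ be defined by

(33) $$(A_\alpha x^*)_n := \alpha_n^2 x_n^* + 2\sum_{i>n} \alpha_n \alpha_i x_i^*, \quad \forall x^* = (x_n^*)_{n\in\mathbb{N}} \in \ell^1.$$

Now let $P_\alpha$ and $S_\alpha$ respectively be the symmetric part and skew part of $A_\alpha$. Let $T_\alpha\colon c_0 \rightrightarrows X^*$ be
defined by

$$\operatorname{gra} T_\alpha := \{(-S_\alpha x^*, x^*) \mid x^* \in X^*, \langle \alpha, x^*\rangle = 0\} = \{(-A_\alpha x^*, x^*) \mid x^* \in X^*, \langle \alpha, x^*\rangle = 0\}$$
(34) $$= \Big\{\Big(\Big(-\sum_{i>n} \alpha_n \alpha_i x_i^* + \sum_{i<n} \alpha_n \alpha_i x_i^*\Big)_{n\in\mathbb{N}}, x^*\Big) \ \Big|\ x^* \in X^*, \langle \alpha, x^*\rangle = 0\Big\}.$$

Then

(i) $\langle A_\alpha x^*, x^*\rangle = \langle \alpha, x^*\rangle^2$, $\forall x^* = (x_n^*)_{n\in\mathbb{N}} \in \ell^1$, and (34) is well defined.

(ii) $A_\alpha$ is a maximally monotone operator.

(iii) $T_\alpha$ is a maximally monotone operator.

(iv) Let $G\colon \ell^1 \to \ell^\infty$ be Gossez's operator [32] defined by

$$\big(G(x^*)\big)_n := \sum_{i>n} x_i^* - \sum_{i<n} x_i^*, \quad \forall (x_n^*)_{n\in\mathbb{N}} \in \ell^1.$$

Then $T_e\colon c_0 \rightrightarrows \ell^1$ as defined by

$$\operatorname{gra} T_e := \{(-G(x^*), x^*) \mid x^* \in \ell^1, \langle x^*, e\rangle = 0\}$$

is a maximally monotone operator, where $e := (1, 1, \ldots, 1, \ldots)$.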

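We may now show that $\mathcal{B}_T$ need not be autoconjugate.

Theorem 4.11 Let $A\colon X^* \to X^{**}$ be linear and continuous. Assume that $\operatorname{ran} A \subseteq X$ and that
there exists $e \in X^{**} \setminus X$ such that $\|e\| < \tfrac{1}{\sqrt{2}}$ and

$$\langle Ax^*, x^*\rangle = \langle e, x^*\rangle^2, \quad \forall x^* \in X^*.$$

Let $P$ and $S$ respectively be the symmetric part and skew part of $A$. Let $T$, $C$ be defined as in
Fact 4.9. Then

$$\mathcal{B}_T(-Aa^*, a^*) > \mathcal{B}_T^*(a^*, -Aa^*), \quad \forall a^* \notin \{e\}_\perp.$$

In consequence, $\mathcal{B}_T$ is not autoconjugate.

Proof. First we claim that

(35) $$\iota_C^{*\intercal}\big|_{X \times X^*} = \iota_{\operatorname{gra} T}.$$

Clearly, if we set $D := \{(A^*x^*, x^*) \mid x^* \in X^*\}$, we have

(36) $$\iota_C^{*\intercal} = \sigma_C^{\intercal} = \iota_{C^\perp}^{\intercal} = \iota_D,$$

where in the second equality we use the fact that $C$ is a subspace. Additionally,

$$\begin{aligned}
A^*x^* \in X &\Leftrightarrow (S + P)^*x^* \in X \Leftrightarrow S^*x^* + P^*x^* \in X \Leftrightarrow -Sx^* + Px^* \in X \\
&\Leftrightarrow -Sx^* - Px^* + 2Px^* \in X \Leftrightarrow 2Px^* - Ax^* \in X \Leftrightarrow Px^* \in X \quad (\text{since } \operatorname{ran} A \subseteq X) \\
&\Leftrightarrow \langle x^*, e\rangle e \in X \quad (\text{by Fact 4.9(ii)})
\end{aligned}$$
(37) $$\Leftrightarrow \langle x^*, e\rangle = 0 \quad (\text{since } e \notin X).$$

Observe that $Px^* = 0$ for all $x^* \in \{e\}_\perp$ by Fact 4.9(ii). Thus, $A^*x^* = -Ax^*$ for all $x^* \in \{e\}_\perp$.
Combining (36) and (37), we have

$$\iota_C^{*\intercal}\big|_{X \times X^*} = \iota_{D \cap (X \times X^*)} = \iota_{\operatorname{gra} T},$$

and hence (35) holds.

Let $a^* \notin \{e\}_\perp$. Then $\langle a^*, e\rangle \neq 0$. Now we compute $\mathcal{B}_T(-Aa^*, a^*)$. By Fact 4.9(v) and (35),

$$\mathcal{B}_T(-Aa^*, a^*)$$
(38) $$= \inf_{(y, y^*) \in X \times X^*} \big\{\iota_C(-Aa^* + y, a^* + y^*) + \iota_{\operatorname{gra} T}(-Aa^* - y, a^* - y^*) + \tfrac12\|y\|^2 + \tfrac12\|y^*\|^2\big\}.$$

Thus

$$\begin{aligned}
\mathcal{B}_T(-Aa^*, a^*) &= \inf_{y = -Ay^*} \big\{\iota_{\operatorname{gra} T}(-Aa^* - y, a^* - y^*) + \tfrac12\|y\|^2 + \tfrac12\|y^*\|^2\big\} \\
&= \inf_{y = -Ay^*,\ \langle a^* - y^*, e\rangle = 0} \big\{\tfrac12\|y\|^2 + \tfrac12\|y^*\|^2\big\} = \inf_{\langle a^* - y^*, e\rangle = 0} \big\{\tfrac12\|Ay^*\|^2 + \tfrac12\|y^*\|^2\big\}
\end{aligned}$$
(39) $$\ge \inf_{\langle a^* - y^*, e\rangle = 0} \langle Ay^*, y^*\rangle = \inf_{\langle a^* - y^*, e\rangle = 0} \langle e, y^*\rangle^2 = \langle e, a^*\rangle^2.$$

Next we will compute $\mathcal{B}_T^*(a^*, -Aa^*)$. By Fact 4.8 and (38), we have

$$\begin{aligned}
\mathcal{B}_T^*(a^*, -Aa^*)
&= \min_{(y^*, y^{**}) \in X^* \times X^{**}} \big\{\iota_C^*(a^* + y^*, -Aa^* + y^{**}) + \iota_{\operatorname{gra} T}^*(a^* - y^*, -Aa^* - y^{**}) + \tfrac12\|y^{**}\|^2 + \tfrac12\|y^*\|^2\big\} \\
&= \min_{(y^*, y^{**}) \in X^* \times X^{**}} \big\{\iota_D(-Aa^* + y^{**}, a^* + y^*) + \iota_{(\operatorname{gra} T)^\perp}(a^* - y^*, -Aa^* - y^{**}) + \tfrac12\|y^{**}\|^2 + \tfrac12\|y^*\|^2\big\} \quad (\text{by (36)}) \\
&\le \iota_D(-Aa^* + 2Pa^*, a^*) + \iota_{(\operatorname{gra} T)^\perp}(a^*, -Aa^* - 2Pa^*) + \tfrac12\|2Pa^*\|^2 \quad (\text{by taking } y^* = 0,\ y^{**} = 2Pa^*) \\
&= \iota_{\operatorname{gra}(-T^*)}(-Aa^* - 2Pa^*, a^*) + \tfrac12\|2Pa^*\|^2 \\
&= \tfrac12\|2Pa^*\|^2 \quad (\text{by Fact 4.9(iv)}) \\
&= \tfrac12\|2\langle a^*, e\rangle e\|^2 \quad (\text{by Fact 4.9(ii)}) \\
&= 2\langle a^*, e\rangle^2\|e\|^2.
\end{aligned}$$

This inequality along with (39), $\langle e, a^*\rangle \neq 0$ and $\|e\| < \tfrac{1}{\sqrt{2}}$ yield

$$\mathcal{B}_T(-Aa^*, a^*) \ge \langle e, a^*\rangle^2 > 2\langle a^*, e\rangle^2\|e\|^2 \ge \mathcal{B}_T^*(a^*, -Aa^*), \quad \forall a^* \notin \{e\}_\perp.$$

Hence $\mathcal{B}_T$ is not autoconjugate. ■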

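Example 4.12 ($c_0$) Let $X := c_0$, with norm $\|\cdot\|_\infty$ so that $X^* = \ell^1$ with norm $\|\cdot\|_1$, and $X^{**} = \ell^\infty$
with its second dual norm $\|\cdot\|_*$. Fix $\alpha := (\alpha_n)_{n\in\mathbb{N}} \in \ell^\infty$ with $\limsup \alpha_n \neq 0$ and $\|\alpha\|_* < \tfrac{1}{\sqrt{2}}$, and
let $A_\alpha\colon \ell^1 \to \ell^\infty$ be defined by

(40) $$(A_\alpha x^*)_n := \alpha_n^2 x_n^* + 2\sum_{i>n} \alpha_n \alpha_i x_i^*, \quad \forall x^* = (x_n^*)_{n\in\mathbb{N}} \in \ell^1.$$

Now let $P_\alpha$ and $S_\alpha$ respectively be the symmetric part and skew part of $A_\alpha$. Let $T_\alpha\colon c_0 \rightrightarrows X^*$ be
defined by

$$\operatorname{gra} T_\alpha := \{(-S_\alpha x^*, x^*) \mid x^* \in X^*, \langle \alpha, x^*\rangle = 0\} = \{(-A_\alpha x^*, x^*) \mid x^* \in X^*, \langle \alpha, x^*\rangle = 0\}$$
(41) $$= \Big\{\Big(\Big(-\sum_{i>n} \alpha_n \alpha_i x_i^* + \sum_{i<n} \alpha_n \alpha_i x_i^*\Big)_{n\in\mathbb{N}}, x^*\Big) \ \Big|\ x^* \in X^*, \langle \alpha, x^*\rangle = 0\Big\}.$$

Then

$$\mathcal{B}_{T_\alpha}(-A_\alpha a^*, a^*) > \mathcal{B}_{T_\alpha}^*(a^*, -A_\alpha a^*), \quad \forall a^* \notin \{\alpha\}_\perp.$$

In consequence, $\mathcal{B}_{T_\alpha}$ is not autoconjugate. This follows by applying Example 4.10 and
Theorem 4.11 directly.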



The latter raises a very interesting question: 

Problem 4.13 Is there a maximally monotone operator on some (resp. every) non-reflexive Ba- 
nach space that has no autoconjugate representer? 

4.3 The Fitzpatrick function and differentiability 



The Fitzpatrick function introduced in [31] was discovered precisely to provide a more transparent
convex alternative to the earlier saddle function construction due to Krauss [20]. We have not
discussed saddle-functions, but they produce interesting maximally monotone operators [54, §33
& §37]. At the time, Fitzpatrick's interests were more centrally in the differentiation theory for
convex functions and monotone operators.

The search for results relating when a maximally monotone $T$ is single-valued to differentiability
of $F_T$ did not yield fruit, and he put the function aside. This is still the one area where to the best
of our knowledge $F_T$ has proved of very little help, in part because generic properties of $\operatorname{dom} F_T$
and of $\operatorname{dom}(T)$ seem poorly related.

That said, monotone operators often provide efficient ways to prove differentiability of convex
functions. The discussion of Mignot's theorem in [20] is somewhat representative of how this works,
as is the treatment in [50]. By contrast, as we have seen, the Fitzpatrick function and its relatives
now provide the easiest access to a gamut of solvability and boundedness results.



5 Other results 

5.1 Renorming results: Asplund averaging 

Edgar Asplund [2] showed how to exploit convex analysis to provide remarkable results on the
existence of equivalent norms with nice properties. Most optimizers are unaware of his lovely idea,
which we recast in the language of inf-convolution. Our development is a reworking of that in
Day [29]. Let us start with two equivalent norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a Banach space $X$. We consider
the quadratic forms $p_0 := \|\cdot\|_1^2/2$ and $q_0 := \|\cdot\|_2^2/2$, and average for $n \ge 0$ by

(42) $$p_{n+1}(x) := \frac{p_n(x) + q_n(x)}{2} \quad \text{and} \quad q_{n+1}(x) := \frac{(p_n \Box q_n)(2x)}{2}.$$

Let $C > 0$ be such that $q_0 \le p_0 \le (1 + C)q_0$. By the construction of $p_n$ and $q_n$, we have
$q_n \le p_n \le (1 + 4^{-n}C)\,q_n$ ([2, Lemma]) and so the sequences $(p_n)_{n\in\mathbb{N}}$, $(q_n)_{n\in\mathbb{N}}$ converge to a common
limit: a convex quadratic function $p$.
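
A one-dimensional sketch may help to see why the averaging converges. It assumes the step takes the arithmetic mean for $p_{n+1}$ and the rescaled inf-convolution $(p_n \Box q_n)(2x)/2$ for $q_{n+1}$; for $p(t) = a t^2/2$ and $q(t) = b t^2/2$ on the real line this reduces to the arithmetic and harmonic means of the coefficients, whose common limit is $\sqrt{ab}$. The helper names are hypothetical, and the brute-force grid is only a sanity check of the closed form.

```python
def asplund_step(a, b):
    """One averaging step for p(t) = a*t*t/2 and q(t) = b*t*t/2 (a >= b > 0).

    Arithmetic mean for p; for q, (p box q)(2t)/2 has coefficient 2ab/(a+b).
    """
    return (a + b) / 2.0, 2.0 * a * b / (a + b)

def inf_conv_at(a, b, z, steps=20001):
    """Grid approximation of inf_u { a*u^2/2 + b*(z-u)^2/2 } for z > 0."""
    best = float("inf")
    for k in range(steps):
        u = z * k / (steps - 1)   # the minimizer b*z/(a+b) lies in [0, z]
        best = min(best, a * u * u / 2.0 + b * (z - u) ** 2 / 2.0)
    return best

if __name__ == "__main__":
    a, b = 9.0, 1.0               # so p0 <= (1 + C) q0 with C = 8
    for n in range(6):
        print(n, a, b)
        a, b = asplund_step(a, b)
    # Both coefficients approach sqrt(9 * 1) = 3.
```

The ratio $a_n/b_n - 1$ shrinks by at least a factor of four per step, matching the bound $p_n \le (1 + 4^{-n}C)\,q_n$ in this scalar case.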

We shall show that the norm $\|\cdot\|_3 := \sqrt{2p}$ typically inherits the good properties of both $\|\cdot\|_1$
and $\|\cdot\|_2$. This is based on the following fairly straightforward result.

Theorem 5.1 (Asplund) (See [2, Theorem 1].) If either $p_0$ or $q_0$ is strictly convex, so is $p$.

We make a very simple application in the case that $X$ is reflexive. In [38], Lindenstrauss showed
that every reflexive Banach space has an equivalent strictly convex norm. The reader may consult
[20, Chapter 4] for more general results. Now take $\|\cdot\|_1$ to be an equivalent strictly convex norm
on $X$, and take $\|\cdot\|_2$ to be an equivalent smooth norm with its dual norm on $X^*$ strictly convex.
Theorem 5.1 shows that p is strictly convex. We note that by Corollary [2^25] and Fact [ZM] 

so that Theorem 5.1 applies to $p_0^*$ and $q_0^*$. Hence $p^*$ is strictly convex (see also [28, Proof of
Corollary 1, page 111]). Hence $\|\cdot\|_3$ ($:= \sqrt{2p}$) and its dual norm ($:= \sqrt{2p^*}$) are equivalent strictly
convex norms on $X$ and $X^*$ respectively.

Hence $\|\cdot\|_3$ is an equivalent strictly convex and smooth norm (since its dual is strictly convex).
The existence of such a norm was one ingredient of Rockafellar's first proof of the Sum theorem.



5.2 Resolvents of maximally monotone operators and connection with convex 
functions 

It is well known since Minty, Rockafellar, and Bertsekas-Eckstein that in Hilbert spaces, monotone
operators can be analyzed from the alternative viewpoint of certain nonexpansive (and thus
Lipschitz continuous) mappings, more precisely, the so-called resolvents. Given a Hilbert space $H$ and
a set-valued operator $A\colon H \rightrightarrows H^*$, the resolvent of $A$ is

$$J_A := (\operatorname{Id} + A)^{-1}.$$

The history of this notion goes back to Minty [40] (in Hilbert spaces) and Brezis, Crandall and
Pazy [27] (in Banach spaces). There exist more general notions of resolvents based on different
tools, such as the normalized duality mapping, the Bregman distance or other maximally monotone
operators (see [371 [U [2]). For more details on resolvents on Hilbert spaces see [3].

The Minty surjectivity theorem (Theorem 4.3 [40]) implies that a monotone operator is maximally
monotone if and only if the resolvent is single-valued with full domain. In fact, a classical result due
to Eckstein-Bertsekas [30] says even more. Recall that a mapping $T\colon H \to H$ is firmly nonexpansive
if for all $x, y \in H$, $\|Tx - Ty\|^2 \le \langle Tx - Ty, x - y\rangle$.

Theorem 5.2 Let $H$ be a Hilbert space. An operator $A\colon H \rightrightarrows H^*$ is (maximal) monotone if and
only if $J_A$ is firmly nonexpansive (with full domain).

Example 5.3 Given a closed convex set $C \subseteq H$, the normal cone operator of $C$, $N_C$, is a maximally
monotone operator whose resolvent can be proved to be the metric projection onto $C$. Therefore,
Theorem 5.2 implies the firm nonexpansivity of the metric projection.
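
The firm nonexpansivity of a metric projection is easy to sample numerically. The sketch below assumes $C$ is the box $[0,1]^n$ in $\mathbb{R}^n$, where the projection is a coordinatewise clip; `project_box` and `firm_gap` are hypothetical helper names chosen for this illustration.

```python
import random

def project_box(x, lo=0.0, hi=1.0):
    """Metric projection onto the box [lo, hi]^n (coordinatewise clipping)."""
    return [min(max(t, lo), hi) for t in x]

def firm_gap(x, y):
    """<Px - Py, x - y> - ||Px - Py||^2, nonnegative for a firmly nonexpansive P."""
    px, py = project_box(x), project_box(y)
    d = [a - b for a, b in zip(px, py)]
    inner = sum(di * (a - b) for di, a, b in zip(d, x, y))
    return inner - sum(di * di for di in d)

if __name__ == "__main__":
    random.seed(0)
    x = [random.uniform(-2, 3) for _ in range(5)]
    y = [random.uniform(-2, 3) for _ in range(5)]
    print(firm_gap(x, y))   # expected to be nonnegative
```

The same points also satisfy the resolvent characterization $x - P_C x \in N_C(P_C x)$, that is, $\langle x - P_C x, c - P_C x\rangle \le 0$ for every $c \in C$.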



In the particular case when $A$ is the subdifferential of a possibly non-differentiable convex function
in a Hilbert space, whose maximal monotonicity was established by Moreau [45] (in Banach spaces
this is due to Rockafellar [53], see also [23, 20]), the resolvent turns into the proximal mapping in
the following sense of Moreau. If $f\colon H \to \left]-\infty, +\infty\right]$ is a lower semicontinuous convex function
defined on a Hilbert space $H$, the proximal or proximity mapping is the operator $\operatorname{prox}_f\colon H \to H$
defined by

$$\operatorname{prox}_f(x) := \operatorname*{argmin}_{y \in H} \big\{f(y) + \tfrac12\|x - y\|^2\big\}.$$

This mapping is well-defined because $\operatorname{prox}_f(x)$ exists and is unique for all $x \in H$. Moreover, there
exists the following subdifferential characterization: $u = \operatorname{prox}_f(x)$ if and only if $x - u \in \partial f(u)$.

Moreau's decomposition in terms of the proximal mapping is a powerful nonlinear analysis tool
in the Hilbert setting that has been used in various areas of optimization and applied mathematics.
Moreau established his decomposition motivated by problems in unilateral mechanics. It can be
proved readily by using the conjugate and subdifferential.

Theorem 5.4 (Moreau decomposition) Given a lower semicontinuous convex function $f\colon
H \to \left]-\infty, +\infty\right]$, for all $x \in H$,

$$x = \operatorname{prox}_f(x) + \operatorname{prox}_{f^*}(x).$$

Example 5.5 Note that for $f := \iota_C$, with $C$ closed and convex, the proximal mapping turns into
the projection onto a closed and convex set $C$. Therefore, this result generalizes the decomposition
by orthogonal projection on subspaces. In particular, if $K$ is a closed convex cone (thus $\iota_K^* = \iota_{K^-}$,
see Example 2.18), Moreau's decomposition provides a characterization of the projection onto $K$:

$$x = y + z \text{ with } y \in K,\ z \in K^- \text{ and } \langle y, z\rangle = 0 \ \Leftrightarrow\ y = P_K x \text{ and } z = P_{K^-} x.$$

This illustrates that in Hilbert space, the Moreau decomposition can be thought of as generalizing
the decomposition into positive and negative parts of a vector in a normed lattice [20, §6.7] to an
arbitrary convex cone.
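
Two small instances of the Moreau decomposition can be checked directly. The sketch assumes $H = \mathbb{R}^n$ and uses two standard pairs: $f = \|\cdot\|_1$, whose proximal mapping is soft-thresholding and whose conjugate is the indicator of the unit $\ell^\infty$ ball (so $\operatorname{prox}_{f^*}$ is a clip), and the cone $K = \mathbb{R}^n_+$ with polar $K^- = \mathbb{R}^n_-$. All function names are hypothetical helpers.

```python
def prox_l1(x):
    """prox of f = ||.||_1: coordinatewise soft-thresholding at level 1."""
    return [max(abs(t) - 1.0, 0.0) * (1.0 if t > 0 else -1.0) for t in x]

def prox_l1_conjugate(x):
    """prox of f* = indicator of the unit sup-norm ball: coordinatewise clip."""
    return [min(max(t, -1.0), 1.0) for t in x]

def cone_split(x):
    """Projections of x onto the nonnegative orthant K and its polar cone K^-."""
    pos = [max(t, 0.0) for t in x]
    neg = [min(t, 0.0) for t in x]
    return pos, neg

if __name__ == "__main__":
    x = [2.5, -0.3, 0.0, -4.0]
    print([p + q for p, q in zip(prox_l1(x), prox_l1_conjugate(x))])  # recovers x
    y, z = cone_split(x)
    print(y, z, sum(a * b for a, b in zip(y, z)))  # y in K, z in K^-, <y, z> = 0
```

The second example is exactly the positive and negative parts decomposition mentioned above, seen as a special case of $x = P_K x + P_{K^-} x$.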

There is another notion associated to an operator $A$, which is strongly related to the resolvent.
That is the Yosida approximation of index $\lambda > 0$ or the Yosida $\lambda$-regularization:

$$A_\lambda := (\lambda \operatorname{Id} + A^{-1})^{-1} = \tfrac{1}{\lambda}(\operatorname{Id} - J_{\lambda A}).$$

If the operator $A$ is maximally monotone, so is the Yosida approximation, and along with the
resolvent they provide the so-called Minty parametrization of the graph of $A$ that is Lipschitz
continuous in both directions [55]:

$$(J_{\lambda A}(z), A_\lambda(z)) = (x, y) \ \Leftrightarrow\ z = x + \lambda y,\ (x, y) \in \operatorname{gra} A.$$

If $A = \partial f$ is the subdifferential of a proper lower semicontinuous convex function $f$, it turns out
that the Yosida approximation of $A$ is the gradient of the Moreau envelope of $f$, $e_\lambda f$, defined as the
infimal convolution of $f$ and $\|\cdot\|^2/2\lambda$, that is,

$$e_\lambda f(x) := \Big(f \,\Box\, \frac{\|\cdot\|^2}{2\lambda}\Big)(x) = \inf_{y \in H} \Big\{f(y) + \frac{1}{2\lambda}\|x - y\|^2\Big\}.$$

This justifies the alternative term Moreau-Yosida approximation for the mapping $(\partial f)_\lambda =
(\lambda \operatorname{Id} + (\partial f)^{-1})^{-1}$. This allows one to obtain a proof in Hilbert space of the connection between the
convexity of the function and the monotonicity of the subdifferential: a proper lower
semicontinuous function is convex if and only if its Clarke subdifferential is monotone.

It is worth mentioning that generally the role of the Moreau envelope is to approximate the
function, with a regularizing effect since it is finite and continuous even though the function may
not be so. This behavior has very useful implications in convex and variational analysis.
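
The link between the Moreau envelope and the Yosida approximation can be seen on the simplest nonsmooth example. The sketch below assumes $f = |\cdot|$ on the real line and uses the standard identification of the resolvent $J_{\lambda\partial f}$ with $\operatorname{prox}_{\lambda f}$ (soft-thresholding at level $\lambda$); the envelope is then the Huber function and its derivative is the clipped slope $x/\lambda$. Function names are hypothetical.

```python
def soft_threshold(x, lam):
    """Resolvent of lam * subdifferential of |.|, i.e. prox of lam*|.|."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def envelope_abs(x, lam):
    """Moreau envelope e_lam f(x) for f = |.|, evaluated at the proximal point."""
    p = soft_threshold(x, lam)
    return abs(p) + (x - p) ** 2 / (2.0 * lam)

def huber(x, lam):
    """Closed form of the same envelope: quadratic near 0, linear far away."""
    return x * x / (2.0 * lam) if abs(x) <= lam else abs(x) - lam / 2.0

def yosida_abs(x, lam):
    """Yosida approximation (1/lam) * (x - J(x)) of the subdifferential of |.|."""
    return (x - soft_threshold(x, lam)) / lam

if __name__ == "__main__":
    lam, h = 0.5, 1e-6
    for x in (-2.0, -0.2, 0.0, 0.3, 1.7):
        slope = (envelope_abs(x + h, lam) - envelope_abs(x - h, lam)) / (2 * h)
        print(x, envelope_abs(x, lam), huber(x, lam), slope, yosida_abs(x, lam))
```

The envelope is finite and differentiable everywhere even though $|\cdot|$ has a kink at the origin, which is the regularizing effect described above.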



5.3 Symbolic convex analysis 

The thesis work of Hamilton [18] has provided a conceptual and effective framework (the SCAT
Maple software) for computing conjugates, subdifferentials and infimal convolutions of functions of
several variables. Key to this is the notion of iterated conjugation (analogous to iterated integration)
and a good data structure.

As a first example, with some care, the convex conjugate of the function

$$f\colon x \mapsto \log\Big(\frac{\sinh(3x)}{\sinh x}\Big)$$

can be symbolically nursed to obtain the result

$$g\colon y \mapsto \frac{y}{2} \log\Big(\frac{y + \sqrt{16 - 3y^2}}{4 - 2y}\Big) + \log\Big(\frac{\sqrt{16 - 3y^2} - 2}{6}\Big),$$

with domain $[-2, 2]$.

Since the conjugate of $g$ is much more easily computed to be $f$, this produces a symbolic
computational proof that $f$ and $g$ are convex and are mutually conjugate.

Similarly, Maple produces the conjugate of $x \mapsto \exp(\exp(x))$ as $y \mapsto y\,(\log(y) - W(y) - 1/W(y))$
in terms of the Lambert $W$ function, the multi-valued inverse of $z \mapsto z e^z$. This function is
unknown to most humans but is built into both Maple and Mathematica. Thus Maple knows that to
order five

$$g(y) = -1 + (-1 + \log y)\,y - \tfrac12 y^2 + \tfrac13 y^3 - \tfrac38 y^4 + O(y^5).$$

Figure 3 shows the Maple-computed conjugate after the SCAT package is loaded:

> f11 := convert(exp(exp(x)), PWF);
> g11 := Conj(f11, y);
> sdg11 := SubDiff(g11);
> Plot(sdg11, y=-1..1, view=[0..1,-5..0], axes=boxed, labels=["$y$", ""]);

[Maple output: g11 equals +∞ for y < 0, −1 at y = 0, and −y (LambertW(y)^2 − LambertW(y) ln(y) + 1) / LambertW(y) for 0 < y; sdg11 is empty for y ≤ 0 and equals { −LambertW(y) + ln(y) } for 0 < y; followed by a plot of sdg11.]

Figure 3: The conjugate and subdifferential of exp exp.

There is a corresponding numerical program CCAT [18]. Current work is adding the capacity to symbolically
compute convex compositions, and so in principle Fenchel duality.
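
The two symbolic conjugates quoted in this subsection can be cross-checked numerically without any computer algebra. The sketch below is not SCAT; it is a plain brute-force comparison under stated assumptions: the conjugate $\sup_x\{xy - f(x)\}$ is approximated by ternary search (valid because the objective is concave), the Lambert $W$ function is computed by a hand-written Newton iteration for $y > 0$, and $\log(\sinh 3x/\sinh x)$ is evaluated through the identity $\sinh 3x/\sinh x = 3 + 4\sinh^2 x$ to avoid the removable singularity at $0$. All function names are hypothetical.

```python
import math

def lambert_w(y, iters=60):
    """Principal branch W(y) for y > 0 via Newton's method on w*exp(w) = y."""
    w = math.log(1.0 + y)              # starts to the right of the root
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - y) / (ew * (w + 1.0))
    return w

def conj_exp_exp(y):
    """Closed form y*(log y - W(y) - 1/W(y)) for the conjugate of exp(exp(x)), y > 0."""
    w = lambert_w(y)
    return y * (math.log(y) - w - 1.0 / w)

def conj_log_sinh_ratio(y):
    """Closed form g(y) for the conjugate of log(sinh(3x)/sinh(x)), |y| < 2."""
    r = math.sqrt(16.0 - 3.0 * y * y)
    return 0.5 * y * math.log((y + r) / (4.0 - 2.0 * y)) + math.log((r - 2.0) / 6.0)

def numeric_conjugate(f, y, lo, hi, iters=200):
    """Ternary search for sup_x { x*y - f(x) } on [lo, hi]; f is assumed convex."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * y - f(m1) < m2 * y - f(m2):
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return x * y - f(x)

def f_exp_exp(x):
    return math.exp(math.exp(x))

def f_log_sinh_ratio(x):
    return math.log(3.0 + 4.0 * math.sinh(x) ** 2)

if __name__ == "__main__":
    for y in (0.5, 1.0, 3.0):
        print(y, conj_exp_exp(y), numeric_conjugate(f_exp_exp, y, -20.0, 5.0))
    for y in (-1.5, 0.0, 1.0, 1.9):
        print(y, conj_log_sinh_ratio(y), numeric_conjugate(f_log_sinh_ratio, y, -30.0, 30.0))
```

Agreement of the two columns is only numerical evidence for the closed forms; the symbolic derivation remains the job of the computer algebra system.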
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5.4 Partial Fractions and Convexity 

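We consider a network objective function $p_N$ given by

\[ p_N(q) := \sum_{\sigma \in S_N} \left( \prod_{i=1}^{N} \frac{q_{\sigma(i)}}{\sum_{j=i}^{N} q_{\sigma(j)}} \right) \left( \sum_{i=1}^{N} \frac{1}{\sum_{j=i}^{N} q_{\sigma(j)}} \right), \]

summed over all $N!$ permutations; so a typical term is

\[ \left( \prod_{i=1}^{N} \frac{q_i}{\sum_{j=i}^{N} q_j} \right) \left( \sum_{i=1}^{N} \frac{1}{\sum_{j=i}^{N} q_j} \right). \]

For example, with $N = 3$ this is

\[ q_1 q_2 q_3 \left(\frac{1}{q_1+q_2+q_3}\right) \left(\frac{1}{q_2+q_3}\right) \left(\frac{1}{q_3}\right) \left( \frac{1}{q_1+q_2+q_3} + \frac{1}{q_2+q_3} + \frac{1}{q_3} \right). \]

This arose as the objective function in research into coupon collection. The researcher, Ian Affleck,
wished to show $p_N$ was convex on the positive orthant.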

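First, we tried to simplify the expression for $p_N$. The partial fraction decomposition gives:

\[
\begin{aligned}
p_1(x_1) &= \frac{1}{x_1}, \\
p_2(x_1,x_2) &= \frac{1}{x_1} + \frac{1}{x_2} - \frac{1}{x_1+x_2}, \\
p_3(x_1,x_2,x_3) &= \frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} - \frac{1}{x_1+x_2} - \frac{1}{x_2+x_3} - \frac{1}{x_1+x_3} + \frac{1}{x_1+x_2+x_3}.
\end{aligned}
\tag{43}
\]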
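The agreement between the permutation sum and the alternating partial-fraction pattern in (43) can be confirmed in exact rational arithmetic for small $N$. This is a sketch under the assumption that the permutation-sum definition is as displayed above; the function names are illustrative and are not taken from any package.

```python
from fractions import Fraction
from itertools import combinations, permutations

def p_permutations(q):
    # Sum over all N! orderings of (product of q_sigma(i) / tail_i) * (sum of 1 / tail_i),
    # where tail_i = q_sigma(i) + ... + q_sigma(N).
    n = len(q)
    total = Fraction(0)
    for sigma in permutations(q):
        tails = [sum(sigma[i:], Fraction(0)) for i in range(n)]
        weight = Fraction(1)
        for q_i, tail in zip(sigma, tails):
            weight *= Fraction(q_i) / tail
        total += weight * sum(1 / tail for tail in tails)
    return total

def p_partial_fractions(x):
    # Alternating sum over nonempty subsets S of 1 / (sum of x_i for i in S), as in (43)
    total = Fraction(0)
    for k in range(1, len(x) + 1):
        sign = 1 if k % 2 else -1
        for subset in combinations(x, k):
            total += Fraction(sign) / sum(subset)
    return total

if __name__ == "__main__":
    for n in range(1, 5):
        q = [2, 3, 5, 7][:n]
        print(n, p_permutations(q), p_partial_fractions(q))
```

Exact fractions are used so that equality can be tested without a tolerance; the cost grows like $N!$, so this check is only practical for small $N$.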

Partial fraction decompositions are another arena in which computer algebra systems are hugely
useful. The reader is invited to try performing the third case in (43) by hand. It is tempting to
predict the "same" pattern will hold for $N = 4$. This is easy to confirm (by computer if not by
hand) and so we are led to:



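Conjecture 5.6 For each $N \in \mathbb{N}$, the function

\[ p_N(x_1,\ldots,x_N) := \int_0^{\infty} \left( 1 - \prod_{i=1}^{N} \left(1 - e^{-t x_i}\right) \right) dt \tag{44} \]

is convex; indeed $1/p_N$ is concave.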
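That the integral in (44) reproduces the alternating pattern of (43) can be checked numerically: expanding the product and integrating each exponential term gives $\sum_{\emptyset \ne S} (-1)^{|S|+1} / \sum_{i \in S} x_i$. The sketch below compares the two using a composite Simpson rule; the truncation point `50 / min(x)` and the step count are assumptions chosen so that the neglected tail is negligible for moderate inputs.

```python
import math
from itertools import combinations

def p_closed_form(x):
    # Alternating sum over nonempty subsets, the pattern suggested by (43)
    total = 0.0
    for k in range(1, len(x) + 1):
        sign = 1.0 if k % 2 else -1.0
        for subset in combinations(x, k):
            total += sign / sum(subset)
    return total

def p_integral(x, n=20000):
    # Composite Simpson rule for (44) on [0, upper]; n must be even.
    upper = 50.0 / min(x)
    h = upper / n

    def integrand(t):
        prod = 1.0
        for x_i in x:
            prod *= -math.expm1(-t * x_i)  # equals 1 - exp(-t * x_i)
        return 1.0 - prod

    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0

if __name__ == "__main__":
    x = (1.0, 2.0, 3.0)
    print(p_integral(x), p_closed_form(x))
```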

One may check symbolically that this is true for $N < 5$ via a large Hessian computation. But this
is impractical for larger $N$. That said, it is easy to numerically sample the Hessian for much larger
$N$, and it is always positive definite. Unfortunately, while the integral is convex, the integrand is
not, or we would be done. Nonetheless, the process was already a success, as the researcher was
able to rederive his objective function in the form of (44).
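The Hessian sampling described above can be imitated in a few lines. This sketch assumes the alternating closed form for $p_N$, whose second partial derivatives are $\partial^2 p_N / \partial x_i \partial x_j = \sum_{S \ni i,j} (-1)^{|S|+1}\, 2 / (\sum_{k \in S} x_k)^3$, and tests positive definiteness by attempting a Cholesky factorisation. The sampling box $[0.5, 2]^N$, the seed, and the function names are arbitrary illustrative choices.

```python
import math
import random
from itertools import combinations

def hessian(x):
    # Exact Hessian of the alternating subset sum, assembled subset by subset
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for k in range(1, n + 1):
        sign = 1.0 if k % 2 else -1.0
        for S in combinations(range(n), k):
            c = sign * 2.0 / sum(x[i] for i in S) ** 3
            for i in S:
                for j in S:
                    H[i][j] += c
    return H

def is_positive_definite(H):
    # A symmetric matrix is positive definite iff Cholesky finds only positive pivots
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = H[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def sample_hessians(n, trials=20, seed=0):
    # True when every sampled Hessian in the box [0.5, 2]^n is positive definite
    rng = random.Random(seed)
    return all(
        is_positive_definite(hessian([rng.uniform(0.5, 2.0) for _ in range(n)]))
        for _ in range(trials)
    )

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, sample_hessians(n))
```

Such sampling is evidence rather than proof, and the subset enumeration costs $2^N$ terms per Hessian, so larger $N$ would call for evaluating (44) by quadrature instead.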

A year after, Omar Hjab suggested re-expressing (44) as the joint expectation of Poisson
distributions (see "Convex, II", SIAM Electronic Problems and Solutions, at
http://www.siam.org/journals/problems/downloadfiles/99-5sii.pdf). Explicitly, this leads to:

Lemma 5.7 [15, §1.7] If $x = (x_1,\ldots,x_N)$ is a point in the positive orthant $\mathbb{R}^N_{++}$, then

\[ \int_0^{\infty} \left( 1 - \prod_{i=1}^{N} \left(1 - e^{-t x_i}\right) \right) dt = \left( \prod_{i=1}^{N} x_i \right) \int_{\mathbb{R}^N_{++}} e^{-\langle x, y\rangle} \max(y_1,\ldots,y_N)\, dy, \tag{45} \]

where $\langle x, y\rangle = x_1 y_1 + \cdots + x_N y_N$ is the Euclidean inner product.

It follows from the lemma (which is proven in [15] with no recourse to probability theory) that

\[ p_N(x) = \int_{\mathbb{R}^N_{++}} e^{-(y_1+\cdots+y_N)} \max\left( \frac{y_1}{x_1},\ldots,\frac{y_N}{x_N} \right) dy, \]

and hence that $p_N$ is positive, decreasing, and convex, as is the integrand. To derive the stronger
result that $1/p_N$ is concave we refer to [15, §1.7]. Observe that since
$\frac{2ab}{a+b} \le \sqrt{ab} \le \frac{a+b}{2}$, it follows from (45) that $p_N$ is log-convex (and convex).
A little more analysis of the integrand shows $p_N$ is strictly convex on its domain. The same
techniques apply when $x_k$ is replaced in (43) or (44) by $g(x_k)$ for a concave positive function $g$.
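The integral representation says that $p_N(x)$ is the expected value of $\max_i (Y_i / x_i)$ for independent unit-rate exponential variables $Y_i$, which invites a Monte Carlo sanity check. The sketch below compares such an estimate with the alternating closed form and tests the log-convexity inequality at one midpoint. The sample size, seed, test points, and function names are illustrative assumptions, and the Monte Carlo figure is only accurate to a few parts in a thousand.

```python
import math
import random
from itertools import combinations

def p_closed_form(x):
    # Alternating sum over nonempty subsets, the pattern suggested by (43)
    total = 0.0
    for k in range(1, len(x) + 1):
        sign = 1.0 if k % 2 else -1.0
        for subset in combinations(x, k):
            total += sign / sum(subset)
    return total

def p_expectation(x, samples=200_000, seed=0):
    # Monte Carlo estimate of E[max_i Y_i / x_i] with Y_i independent Exp(1)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += max(rng.expovariate(1.0) / x_i for x_i in x)
    return total / samples

def log_convex_at_midpoint(x, z):
    # Checks p((x + z)/2) <= sqrt(p(x) * p(z)) for one pair of points
    mid = [(a + b) / 2.0 for a, b in zip(x, z)]
    return p_closed_form(mid) <= math.sqrt(p_closed_form(x) * p_closed_form(z))

if __name__ == "__main__":
    x = (1.0, 2.0, 3.0)
    print(p_expectation(x), p_closed_form(x))
    print(log_convex_at_midpoint(x, (3.0, 1.0, 0.5)))
```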

There is still no truly direct proof of the convexity of $p_N$. Surely there should be! This
development neatly shows both the power of computer-assisted convex analysis and its current limitations.

Lest one think most results on the real line are easy, we challenge the reader to prove the empirical
observation that

\[ p \mapsto \sqrt{p} \int_0^{\infty} \left| \frac{\sin x}{x} \right|^{p} dx \]

is difference convex on $(1,\infty)$, i.e. it can be written as a difference of two convex functions [3].

6 Concluding comments 

All researchers and practitioners in convex analysis and optimization owe a great debt to
Jean-Jacques Moreau, whether they know so or not. We are delighted to help make his seminal role
more apparent to the current generation of scholars. For those who read French we urge them to
experience the pleasure of [41, 42, 43, 45] and especially [46]. For others, we highly recommend
[47], which follows [IS] and of which Zuhair Nashed wrote in his Mathematical Review MR0217617:
"There is a great need for papers of this kind; the present paper serves as a model of clarity and
motivation."



Acknowledgments The authors were all partially supported by various Australian Research 
Council grants. 



References 

[1] Y. Alber and D. Butnariu, "Convergence of Bregman projection methods for solving consistent
convex feasibility problems in reflexive Banach spaces", Journal of Optimization Theory and
Applications, vol. 92, pp. 33-61, 1997.

[2] E. Asplund, "Averaged norms", Israel Journal of Mathematics, vol. 5, pp. 227-233, 1967.

[3] M. Bačák and J.M. Borwein, "On difference convexity of locally Lipschitz functions",
Optimization, pp. 961-978, 2011.

[4] H.H. Bauschke, J.M. Borwein, X. Wang, and L. Yao, "Construction of pathological maximally
monotone operators on non-reflexive Banach spaces", Set-Valued and Variational Analysis,
vol. 20, pp. 387-415, 2012.

[5] H.H. Bauschke and P.L. Combettes, Convex Analysis and Monotone Operator Theory in
Hilbert Spaces, Springer, 2011.

[6] H.H. Bauschke and X. Wang, "The kernel average for two convex functions and its applications
to the extension and representation of monotone operators", Transactions of the American
Mathematical Society, vol. 36, pp. 5947-5965, 2009.

[7] H.H. Bauschke, X. Wang, and L. Yao, "Monotone linear relations: maximality and Fitzpatrick
functions", Journal of Convex Analysis, vol. 16, pp. 673-686, 2009.

[8] H.H. Bauschke, X. Wang, and L. Yao, "Autoconjugate representers for linear monotone
operators", Mathematical Programming (Series B), vol. 123, pp. 5-24, 2010.

[9] H.H. Bauschke, X. Wang, and L. Yao, "General resolvents for monotone operators:
characterization and extension", in Biomedical Mathematics: Promising Directions in Imaging, Therapy
Planning and Inverse Problems, Medical Physics Publishing, pp. 57-74, 2010.

[10] J.M. Borwein, "A generalization of Young's inequality", Mathematical Inequalities &
Applications, vol. 1, pp. 131-136, 1998.

[11] J.M. Borwein, "Maximal monotonicity via convex analysis", Journal of Convex Analysis,
vol. 13, pp. 561-586, 2006.

[12] J.M. Borwein, "Maximality of sums of two maximal monotone operators in general Banach
space", Proceedings of the American Mathematical Society, vol. 135, pp. 3917-3924, 2007.

[13] J.M. Borwein, "Fifty years of maximal monotonicity", Optimization Letters, vol. 4,
pp. 473-490, 2010.

[14] J.M. Borwein and D.H. Bailey, Mathematics by Experiment: Plausible Reasoning in the 21st
Century, A.K. Peters Ltd, Second expanded edition, 2008.

[15] J.M. Borwein, D.H. Bailey and R. Girgensohn, Experimentation in Mathematics:
Computational Paths to Discovery, A.K. Peters Ltd, 2004. ISBN: 1-56881-211-6.

[16] J.M. Borwein and S. Fitzpatrick, "Local boundedness of monotone operators under minimal
hypotheses", Bulletin of the Australian Mathematical Society, vol. 39, pp. 439-441, 1989.

[17] J.M. Borwein, R.S. Burachik, and L. Yao, "Conditions for zero duality gap in convex
programming", submitted; http://arxiv.org/abs/1211.4953v1, November 2012.






[18] J.M. Borwein and C. Hamilton, "Symbolic Convex Analysis: Algorithms and Examples",
Mathematical Programming, vol. 116, pp. 17-35, 2009. Maple packages SCAT and CCAT available
at http://carma.newcastle.edu.au/ConvexFunctions/SCAT.ZIP

[19] J.M. Borwein and A.S. Lewis, Convex Analysis and Nonlinear Optimization, Second
expanded edition, Springer, 2005.

[20] J.M. Borwein and J.D. Vanderwerff, Convex Functions, Cambridge University Press, 2010.

[21] J.M. Borwein and L. Yao, "Structure theory for maximally monotone operators with points of
continuity", Journal of Optimization Theory and Applications, vol. 156, 2013 (invited paper),
http://dx.doi.org/10.1007/s10957-012-0162-y

[22] J.M. Borwein and L. Yao, "Recent progress on Monotone Operator Theory",
http://arxiv.org/abs/1210.3401v2, October 2012.

[23] J.M. Borwein and Q.J. Zhu, Techniques of Variational Analysis, CMS Books in
Mathematics/Ouvrages de Mathématiques de la SMC, 20, Springer-Verlag, New York, 2005.

[24] R.I. Boţ, S. Grad, and G. Wanka, Duality in Vector Optimization, Springer, 2009.

[25] R.I. Boţ and G. Wanka, "A weaker regularity condition for subdifferential calculus and Fenchel
duality in infinite dimensional spaces", Nonlinear Analysis, vol. 64, pp. 2787-2804, 2006.

[26] R.S. Burachik and A.N. Iusem, Set-Valued Mappings and Enlargements of Monotone
Operators, Springer, vol. 8, 2008.

[27] H. Brezis, G. Crandall and P. Pazy, "Perturbations of nonlinear maximal monotone sets in
Banach spaces", Communications on Pure and Applied Mathematics, vol. 23, pp. 123-144,
1970.

[28] J. Diestel, Geometry of Banach Spaces, Springer-Verlag, 1975.

[29] M.M. Day, Normed Linear Spaces, Third edition, Springer-Verlag, New York-Heidelberg, 1973.

[30] J. Eckstein and D.P. Bertsekas, "On the Douglas-Rachford splitting method and the
proximal point algorithm for maximal monotone operators", Mathematical Programming, vol. 55,
pp. 293-318, 1992.

[31] S. Fitzpatrick, "Representing monotone operators by convex functions", in
Workshop/Miniconference on Functional Analysis and Optimization (Canberra 1988), Proceedings of the
Centre for Mathematical Analysis, Australian National University, vol. 20, Canberra, Australia,
pp. 59-65, 1988.

[32] J.-P. Gossez, "On the range of a coercive maximal monotone operator in a nonreflexive Banach
space", Proceedings of the American Mathematical Society, vol. 35, pp. 88-92, 1972.

[33] J.-B. Hiriart-Urruty, M. Moussaoui, A. Seeger, and M. Volle, "Subdifferential calculus without
qualification conditions, using approximate subdifferentials: a survey", Nonlinear Analysis,
vol. 24, pp. 1727-1754, 1995.






[34] J.-B. Hiriart-Urruty and R. Phelps, "Subdifferential Calculus Using ε-Subdifferentials",
Journal of Functional Analysis, vol. 118, pp. 154-166, 1993.

[35] L. Hörmander, "Sur la fonction d'appui des ensembles convexes dans un espace localement
convexe", Arkiv för Matematik, vol. 3, pp. 181-186, 1955.

[36] V. Klee, "Convexity of Chebyshev sets", Mathematische Annalen, vol. 142, pp. 292-304, 1961.

[37] F. Kohsaka and W. Takahashi, "Existence and approximation of fixed points of firmly
nonexpansive-type mappings in Banach spaces", SIAM Journal on Optimization, vol. 19, pp. 824-835,
2008.

[38] J. Lindenstrauss, "On nonseparable reflexive Banach spaces", Bulletin of the American
Mathematical Society, vol. 72, pp. 967-970, 1966.

[39] P. Maréchal, "A convexity theorem for multiplicative functions", Optimization Letters, vol. 6,
pp. 357-362, 2012.

[40] G. Minty, "Monotone (nonlinear) operators in a Hilbert space", Duke Mathematical Journal,
vol. 29, pp. 341-346, 1962.

[41] J.J. Moreau, "Fonctions convexes en dualité", Faculté des Sciences de Montpellier, Séminaires
de Mathématiques, Université de Montpellier, Montpellier, 1962.

[42] J.J. Moreau, "Fonctions à valeurs dans [−∞,+∞]; notions algébriques", Faculté des Sciences
de Montpellier, Séminaires de Mathématiques, Université de Montpellier, Montpellier, 1963.

[43] J.J. Moreau, "Étude locale d'une fonctionnelle convexe", Faculté des Sciences de Montpellier,
Séminaires de Mathématiques, Université de Montpellier, Montpellier, 1963.

[44] J.J. Moreau, "Sur la fonction polaire d'une fonctionnelle semi-continue supérieurement",
Comptes Rendus de l'Académie des Sciences, vol. 258, pp. 1128-1130, 1964.

[45] J.J. Moreau, "Proximité et dualité dans un espace hilbertien", Bulletin de la Société
Mathématique de France, vol. 93, pp. 273-299, 1965.

[46] J.J. Moreau, Fonctionnelles convexes, Séminaire Jean Leray, Collège de France, Paris,
pp. 1-108, 1966-1967. Available at http://carma.newcastle.edu.au/ConvexFunctions/moreau66-67.pdf

[47] J.J. Moreau, "Convexity and duality", pp. 145-169 in Functional Analysis and Optimization,
Academic Press, New York, 1966.

[48] J.-P. Penot, "The relevance of convex analysis for the study of monotonicity", Nonlinear
Analysis, vol. 58, pp. 855-871, 2004.

[49] J.-P. Penot and C. Zălinescu, "Some problems about the representation of monotone
operators by convex functions", The Australian New Zealand Industrial and Applied Mathematics
Journal, vol. 47, pp. 1-20, 2005.






[50] R.R. Phelps, Convex Functions, Monotone Operators and Differentiability, 2nd Edition,
Springer-Verlag, 1993.

[51] R.R. Phelps and S. Simons, "Unbounded linear monotone operators on nonreflexive Banach
spaces", Journal of Nonlinear and Convex Analysis, vol. 5, pp. 303-328, 1998.

[52] R.T. Rockafellar, "Extension of Fenchel's duality theorem for convex functions", Duke
Mathematical Journal, vol. 33, pp. 81-89, 1966.

[53] R.T. Rockafellar, "On the maximal monotonicity of subdifferential mappings", Pacific Journal
of Mathematics, vol. 33, pp. 209-216, 1970.

[54] R.T. Rockafellar, "On the maximality of sums of nonlinear monotone operators", Transactions
of the American Mathematical Society, vol. 149, pp. 75-88, 1970.

[55] R.T. Rockafellar and R. J-B Wets, Variational Analysis, Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences], 317, Springer-Verlag, Berlin,
1998 (3rd Printing, 2009).

[56] W. Rudin, Functional Analysis, Second Edition, McGraw-Hill, 1991.

[57] S. Simons, Minimax and Monotonicity, Springer-Verlag, 1998.

[58] S. Simons, From Hahn-Banach to Monotonicity, Springer-Verlag, 2008.

[59] S. Simons and C. Zălinescu, "A new proof for Rockafellar's characterization of maximal
monotone operators", Proceedings of the American Mathematical Society, vol. 132, pp. 2969-2972,
2004.

[60] S. Simons and C. Zălinescu, "Fenchel duality, Fitzpatrick functions and maximal
monotonicity", Journal of Nonlinear and Convex Analysis, vol. 6, pp. 1-22, 2005.

[61] L. Sucheston, "Banach limits", American Mathematical Monthly, vol. 74, pp. 308-311, 1967.

[62] C. Zălinescu, Convex Analysis in General Vector Spaces, World Scientific Publishing, 2002.

[63] E. Zeidler, Nonlinear Functional Analysis and its Applications II/A: Linear Monotone
Operators, Springer-Verlag, 1990.

[64] E. Zeidler, Nonlinear Functional Analysis and its Applications II/B: Nonlinear Monotone
Operators, Springer-Verlag, 1990.






